This is the second volume of the final technical report for the
project "Distributed Database Control and Allocation," sponsored by Rome
Air Development Center, contract number F30602-81-C-0028. This volume
describes work on the performance analysis of concurrency control
algorithms.

This volume is a collection of papers written during the course of
the project, each paper analyzing from a different perspective the
results of the performance study. It consists of five sections.
Section I presents a study that analyzes the relationship between the
performance of the two phase locking algorithm and the following system
parameters: access distribution of the database, data granularity,
transaction size, and multiprogramming level. In a distributed database
system, communication delay is also a major factor affecting the
performance of a concurrency control algorithm, and we present in Section II
an analysis of the relationship between the performance of the two phase
locking algorithm and the communication delay. Another important factor
that affects the performance of a concurrency control algorithm is the
ratio of read-only transactions to write transactions. In Section III we
present an analysis of the relationship between the performance of the
two phase locking algorithm and this ratio.

Section IV extends the analysis to algorithms based on timestamps
by presenting a comparison of the performance of three distributed
concurrency control algorithms: the Basic Timestamp, Multiple Version
Timestamp, and two phase locking algorithms.

In Section V we analyze the two phase locking algorithm in more
detail and refine the algorithm into nine algorithms. In addition, we
reevaluate the previous two timestamp algorithms in more detail and
analyze a new timestamp based algorithm, the Dynamic Timestamp
algorithm. We then compare the performance of the twelve algorithms.
PERFORMANCE OF TWO PHASE LOCKING*


Wente  K.  Lin 
Jerry  Nolte 


* This paper appeared in the Proceedings of the 1982 Berkeley Workshop
on Distributed Data Management and Computer Networks.
2. Performance of Two Phase Locking


Abstract 


Simulation and analytical modeling of two phase locking in a
DBMS is the subject of this study. It is only part of a larger project
that is studying the performance of various concurrency control and
reliability algorithms in a distributed DBMS. In the simulation model,
the application environment is characterized by the transaction size
(the number of lockable units requested by each transaction), and the
system environment by the number of transactions running concurrently
(the multiprogramming level), the total number of lockable units in the
database, and the distribution of accesses to these lockable units. These
environments are varied for different simulation runs. Output from
these simulation runs includes the probabilities that a lock request is
involved in a conflict and in a deadlock (PC and PD, respectively), and the
average waiting delay (WT) and its standard deviation (DV) of a blocked
lock request. The results show that the system behaves quite similarly
for different access distributions: PC, PD, WT, and DV all increase
more than linearly with the multiprogramming level and the transaction
size; PC increases faster with the multiprogramming level than with
the transaction size, and the reverse is true for PD, WT, and DV.
Regression analysis on the simulation results reveals interesting
relationships between the granularity of the lockable units and PC, PD, and
WT. Because of the assumption of a fixed delay (excluding blocking due to
lock conflict) between two consecutive lock requests by a transaction,
the results apply to a centralized DBMS with little I/O delay variation
and to a distributed DBMS with little communication delay variation.
2.1 Introduction

In the two phase locking protocol as described in Gray [1], during
the first phase transactions accumulate locks incrementally, acquiring
each lock as its need arises, and during the second phase, release each
lock as soon as its need ends. But to spare the end users the
responsibility of requesting and releasing locks, most DBMSs implement implicit
locking. The DBMSs request and release the locks automatically when the
transactions request the data items and when the transactions end,
respectively. Because a DBMS, not knowing enough of the syntax and
semantics of the transactions, is ignorant of the time when each data
item is no longer needed, it can only release the locks held by a
transaction when the transaction ends. Besides, if locks held by a
transaction are released before the transaction ends, then the abortion of the
transaction causes roll-backs of all other transactions that have read
data released by the aborted transaction. To avoid the problems
discussed above, most DBMSs release locks held by a transaction when the
transaction ends. The performance of this modified two-phase locking is
the subject of this study.

In this study we use several measures of system performance. We
emphasize the blocking and restart behavior of transactions. We
concentrate on the basic underlying factors of conflict, deadlock, and wait
duration. The performance variables are listed as follows:

1.  the  average  probability  of  a  lock  request  conflicting  with  another 
one; 

2.  the  average  probability  of  a  lock  request  causing  a  deadlock; 

3.  the  average  waiting  delay  of  a  conflicting  lock  request; 

4.  and  the  standard  deviation  of  this  delay. 

Besides the locking protocol, the performance of a DBMS depends on several
system and application parameters:

1.  the  average  number  of  locks  requested  by  a  transaction  (transaction 
size); 

2. the maximum number of transactions running concurrently (the
multiprogramming level);

3.  the  size  of  the  group  that  is  the  unit  of  locking  (lockable  unit 
size); 

4.  the  size  of  the  database  (total  number  of  lockable  units); 

5.  and  the  distribution  of  lock  requests  to  the  lockable  units  of  the 
database. 
Two distributions of lock requests to the lockable units are
simulated. The random access model assumes that all lockable units have the
same probability of being accessed by a lock request. The 20/80 model
assumes that 20% of the database is accessed 80% of the time.
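
The two access models can be illustrated with a small sampling routine. This is only a sketch under stated assumptions: the function name `draw_unit` is hypothetical, and the 20/80 model is read here as a "hot" set made of the first 20% of the units, accessed uniformly 80% of the time, with the remaining units accessed uniformly the rest of the time. The text does not say how accesses are spread within each group.

```python
import random

def draw_unit(dz, model="random", rng=random):
    """Return the index of one lockable unit out of dz units."""
    if model == "random":
        # Random access model: every unit is equally likely.
        return rng.randrange(dz)
    # 20/80 model (assumed reading): the first 20% of the units form a
    # hot set that receives 80% of the lock requests.
    hot = max(1, dz // 5)
    if hot == dz or rng.random() < 0.8:
        return rng.randrange(hot)
    return hot + rng.randrange(dz - hot)
```

With `dz = 100`, roughly 80% of the draws under the 20/80 model fall on units 0 through 19.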

Using  simulation  and  statistical  data  analysis  techniques,  this 
paper  studies  the  relationships  between  the  performance  of  a  DBMS  and 
those  system  and  application  parameters  affecting  it. 

A few researchers have attempted similar studies. In Lin [2], the
same approach taken in this study was used to evaluate two timestamping
protocols, but its results could not be extended to the two-phase
locking protocol. In Naka [3], the result confirmed that concurrent
updating of the database by transactions degrades the performance of a DBMS.
In Spit [4], two phase locking and the modified version (described
above) were found to perform equally well in system-2000. In Mun [5],
deadlock resolution methods were studied, and three were found to be
superior: restarting the smallest transaction, the one holding the fewest
locks, and the one having consumed the least cpu time. In addition, it was found
that simultaneous reduction of the sizes of the lockable unit and the
transaction improves the performance. But the oversimplified definition
of performance as the cpu utilization made the results less useful. In
Ries [6], the scope and the objective of the simulation were much more
ambitious than in the previous three. Nevertheless, it emphasized the
effects of the size of the lockable unit on the performance of the DBMS,
which was defined as the cpu and I/O utilizations, plus in some cases the
response time and the system throughput. The main model required
transactions to obtain all the required locks before they started, and the
request-as-needed model was only briefly studied. It had many
interesting results showing how the size of the lockable unit interacts with the
system and application parameters to affect the performance. But its
assumption that the multiprogramming level has no effect on performance
is contradicted by this study. Also, performance was not related to
system and application parameters as precisely and quantitatively as in
the present study.

This study expands on Lin [2] and Ries [6], and presents the
results in the same precise form as that of Lin [2]. The second
subsection discusses the simulation model; the third subsection presents and
analyzes the results of the random access model; the fourth subsection
presents and summarizes the results of the 20/80 model; and the fifth
subsection summarizes the results of this study.


2.2  Simulation  Model 

A complete description of a simulation model for a DBMS must
include the database, the transactions, the computer system, and the
output parameters.

The  database  consists  of  DZ  (Database  siZe)  lockable  units  of  equal 
size.  The  size  of  each  lockable  unit  is  irrelevant  to  our  model.  The 
database  size  DZ  varies  among  different  simulation  runs. 

We simulate two different access distributions to the database: the
random access model in which all lockable units are equally likely to be
accessed, and the 20/80 access model in which 20% of the database is
accessed 80% of the time.

All transactions request only exclusive locks. Within each
simulation run, all transactions request the same number TZ (Transaction siZe)
of lockable units, but TZ varies among different simulation runs. Each
transaction requests its lockable units sequentially, but different
transactions request lockable units asynchronously. When a transaction
requests a lockable unit, a random number is drawn to select one
among all the lockable units in the database except those held by the
requesting transaction; thus a transaction never requests the same
lockable unit more than once. If the drawn lockable unit is locked by
another transaction, the requesting transaction is queued at the end of
a FIFO queue. Otherwise, it sets a lock on the drawn lockable unit and
waits one time unit before requesting another lockable unit. Since
processing a lock request is assumed to be instantaneous, the simulation
timer is advanced one unit only after all outstanding lock requests have
been processed. The assumption that a transaction waits a unit of time
(after obtaining a lockable unit) before requesting another one implies
that it takes one time unit to retrieve a lockable unit from the
database, to wait for the cpu, and to process it. Each transaction releases
all its lockable units after its completion or abortion.

We model the computer system at a high functional level. The cpu,
I/O devices, and other hardware components are invisible in the
simulation model; their existence is implied by the processing time required
for each lockable unit discussed previously. The system is a closed
multiprogramming system, i.e., the number of transactions running
concurrently remains at a constant level MP (Multiprogramming level); a new
transaction starts as soon as one completes or aborts. Nonetheless MP
varies among different simulation runs. A lock request conflicts if it
requests a lockable unit already held by another transaction. The
system maintains a lock with a FIFO queue for each lockable unit and places
conflicting lock requests into the queue. It checks for deadlocks as
soon as a lock request conflicts. If it detects a deadlock, the
transaction of the conflicting lock request aborts and restarts immediately;
it restarts with a new randomly drawn sequence of lock requests. Checks
for conflicts and deadlocks are instantaneous.

For each simulation run, the output includes the fraction of
conflicting lock requests (which is the same as the probability of a lock
request conflicting with another lock request, PC), the fraction of
conflicting lock requests causing deadlocks (which is the same as the
probability of a lock request causing a deadlock, PD), and the average
waiting time of a blocked lock request (WT) and its standard deviation (DV).
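
The model described in this subsection can be sketched in a few dozen lines of Python. The sketch is illustrative only and is not the simulator used in the study. The function name `simulate`, the round-robin processing order within a time unit, and the choice to let an aborted transaction issue its first new request one time unit later are assumptions. It covers only the random access model and requires TZ to be no larger than DZ.

```python
import random
import statistics
from collections import deque

def simulate(mp, tz, dz, steps=2000, seed=0):
    """Closed system: mp transactions, tz exclusive locks each, dz units."""
    rng = random.Random(seed)
    holder = {}                         # unit -> transaction holding it
    queue = {}                          # unit -> FIFO of blocked transactions
    held = [set() for _ in range(mp)]   # units locked by each transaction
    waiting_on = [None] * mp            # unit a blocked transaction waits for
    blocked_at = [0] * mp               # time at which it blocked
    ready_at = [0] * mp                 # earliest time of its next request
    requests = conflicts = deadlocks = 0
    waits = []

    def release_all(t, now):
        # Free every lock of t; hand each one to the head of its queue.
        for u in held[t]:
            del holder[u]
            if queue.get(u):
                nxt = queue[u].popleft()
                holder[u] = nxt
                held[nxt].add(u)
                waiting_on[nxt] = None
                ready_at[nxt] = now + 1
                waits.append(now - blocked_at[nxt])
        held[t] = set()

    def causes_deadlock(t, u):
        # Follow holder -> unit it waits for -> holder ...; a deadlock
        # exists if the chain leads back to the requesting transaction t.
        x = holder[u]
        while x != t and waiting_on[x] is not None:
            x = holder[waiting_on[x]]
        return x == t

    for now in range(steps):
        for t in range(mp):
            if waiting_on[t] is not None or ready_at[t] > now:
                continue
            if len(held[t]) == tz:      # finished: release, start a new one
                release_all(t, now)
            u = rng.randrange(dz)       # random access model
            while u in held[t]:         # never request a unit already held
                u = rng.randrange(dz)
            requests += 1
            ready_at[t] = now + 1
            if u not in holder:
                holder[u] = t
                held[t].add(u)
                continue
            conflicts += 1
            if causes_deadlock(t, u):
                deadlocks += 1
                release_all(t, now)     # abort; restart with fresh requests
            else:
                queue.setdefault(u, deque()).append(t)
                waiting_on[t] = u
                blocked_at[t] = now

    return {
        "PC": conflicts / requests,
        "PD": deadlocks / conflicts if conflicts else 0.0,
        "WT": statistics.mean(waits) if waits else 0.0,
        "DV": statistics.pstdev(waits) if waits else 0.0,
    }
```

Because each blocked transaction waits for exactly one lock and the requester aborts whenever its request would close a cycle, following the chain of lock holders is enough to detect a deadlock in this sketch. A call such as `simulate(16, 7, 1024)` returns the four output parameters for one run.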

2.3  Simulation  Results  of  the  Random  Access  Model 

Sixty-four simulations were run, one for each combination of four values
of multiprogramming level (MP), transaction size (TZ), and database size
(DZ). The results are presented and analyzed in this subsection in the
following order: PC, PD, WT, and DV. The analysis consists of three steps:
visual inspection, regression analysis, and examination of the regression
equations.

The results of PC are presented in Figure 2.1. The figure shows
that for a fixed DZ, PC increases with both MP and TZ, and the increase
is larger with MP than with TZ. This behavior is explained by the
following observation during the simulation runs: the number of
transactions deadlocked increases faster with the transaction size than
with the multiprogramming level. Since a deadlocked transaction aborts
and releases all held locks as soon as the deadlock occurs, the total
number of locks outstanding (not released) increases more slowly with the
transaction size than with the multiprogramming level.

If a diagonal line is drawn from the top left to the bottom right
of each table in the figure, each number below the line is always larger
than the opposite number across the line. Assuming DZ is fixed, two
elements across the diagonal line represent the same load (L), defined as
the product of MP and TZ divided by DZ. For example, a system with 16
transactions, each requesting 7 locks, imposes the same load (112
lockable units) on the database as a system with 7 transactions, each
requesting 16 locks. This line shows that with the same load, the
system with the higher multiprogramming level has a higher probability of
conflict than the system with the larger transaction size. This behavior is
explained by the following observation during the simulation runs.
Assuming the load L and the database size DZ are fixed, then on the
average, a larger MP with smaller TZ implies fewer deadlocks and more
locks outstanding. Since each lockable unit has the same probability of
being accessed, more outstanding locks means a higher probability of
conflict. But a higher probability of conflict does not necessarily mean a
longer response time, because a smaller transaction size causes
conflicting requests to wait less and to deadlock less, as will be shown.
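
The equal-load pairs across the diagonal can be checked with a one-line helper. This is a minimal sketch; the function name `load` and the database size of 1024 lockable units are assumptions chosen only for illustration.

```python
def load(mp, tz, dz):
    """Load L = (MP x TZ) / DZ, as defined in the text."""
    return mp * tz / dz

dz = 1024                     # hypothetical database size
a = load(16, 7, dz)           # 16 transactions of 7 locks each
b = load(7, 16, dz)           # 7 transactions of 16 locks each
```

Both configurations request 112 lockable units in total, so `a` and `b` are equal even though the simulation reports different probabilities of conflict for them.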

The differences across the diagonal line diminish as the database
size DZ increases; that is, the probability of conflict (PC) is
approximately proportional to the load L when the load on the database
is light, because increasing the database size without increasing the
multiprogramming level or the transaction size is equivalent to
decreasing the load on the database.

We applied regression analysis to the data in Figure 2.1, and found
equation (2.1) a good fit. The residuals (the differences between the
actual values and the values predicted by the equation) are within
2.5% of the actual values. We did a few simulation runs with larger
values of DZ, MP, and TZ, and found that the equation is still a good
fit for DZ of up to 12384, MP of up to 128, and TZ of up to 32; but we
found that when the transaction size TZ gets much larger than 32, the
equation under-estimates the probability of conflict (PC) substantially.

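$$ PC = 0.72\,\frac{(MP-1)^{b}\,TZ^{c}}{DZ^{d}}, \qquad L = \frac{MP \times TZ}{DZ} \qquad (2.1) $$

where b, c, and d are exponents that vary linearly with the load L.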


Next,  we  use  the  regression  equation  to  examine  the  relationship 
between  the  size  of  the  lockable  unit  and  the  probability  of  conflict. 


If we split each lockable unit into k smaller units, then the
database size increases to k times its original size. Because of the
smaller lockable units, a transaction must request more lockable units;
thus the transaction size increases to w (1 ≤ w ≤ k) times its original
size. The value of w depends on how well the database is placed before
the split. If the database is originally well placed, then all the data
items contained in the original TZ lockable units are wanted by the
transaction, and no frivolous data items are retrieved. In this case,
when a lockable unit is split into k smaller ones, the transaction size
increases to k times its original size (w = k). Otherwise, if the
database is badly placed before the split, then the lockable units retrieved
by a transaction contain a lot of unwanted data items. Thus, after the
split, a transaction may request the same number of lockable units and
still obtain all the data items it needs (w = 1). In most cases, however,
w will be larger than one and smaller than k.
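
The effect of such a split on the model parameters can be written out directly. The helper below is a sketch; the name `split_units` and the sample values are hypothetical. It returns the new database size k x DZ, the new transaction size w x TZ, and the resulting load, which equals the original load multiplied by w/k.

```python
def split_units(mp, tz, dz, k, w):
    """Parameters after splitting each lockable unit into k smaller ones.

    w is the growth factor of the transaction size: w = k for a well
    placed database, w = 1 for a badly placed one.
    """
    if not 1 <= w <= k:
        raise ValueError("w must satisfy 1 <= w <= k")
    new_dz = k * dz
    new_tz = w * tz
    new_load = mp * new_tz / new_dz   # equals (mp * tz / dz) * w / k
    return new_dz, new_tz, new_load

well_placed = split_units(mp=16, tz=8, dz=1024, k=4, w=4)
badly_placed = split_units(mp=16, tz=8, dz=1024, k=4, w=1)
```

In the well placed case the load is unchanged (16 x 32 / 4096 = 0.125), while in the badly placed case it drops to one quarter of its original value.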


Replacing DZ and TZ by kDZ and wTZ, equation (2.1) becomes equation
(2.2),

$$ PC' = t \times PC \qquad (2.2) $$

where

$$ t = \frac{DZ^{0.28Lr}\,TZ^{0.13Lr}\,w^{1-(0.13Lw)/k}}{(MP-1)^{0.35Lr}\,k^{1+(0.28Lw)/k}} \qquad (2.2a) $$

and

$$ r = (k-w)/k. $$


Setting w to k, so that r is zero, equation (2.2a) becomes (2.2b).

$$ t = k^{-0.41L} \qquad (2.2b) $$

Since k is larger than one, t is smaller than one. Thus smaller
lockable units imply a smaller probability of conflict whenever the database
is well placed. But as we will show later, a smaller probability of
conflict with a larger transaction size may result in a higher probability of
deadlock and a longer transaction response time. As L approaches zero,
i.e., the load is light, t approximates one, and the difference between
PC and PC' becomes insignificant.

Setting w to one in equation (2.2a) results in equation (2.2c).

$$ t = \frac{DZ^{0.28Lr}\,TZ^{0.13Lr}}{(MP-1)^{0.35Lr}\,k^{1+(0.28L)/k}} \qquad (2.2c) $$

where

$$ r = (k-1)/k, $$

and t ≈ 1/k as L approaches zero.

Equation (2.2c) shows that when the load L is smaller than 100%, which
is within our simulation range and is realistic, t is less than one.
Therefore,  if  the  database  is  badly  placed,  smaller  lockable  units  imply 
a  smaller  probability  of  conflict.  In  this  case,  since  the  transaction 
size  remains  the  same,  a  smaller  probability  of  conflict  does  imply  a 
smaller  probability  of  deadlock  and  shorter  response  time. 

To  sum  up,  smaller  lockable  units  always  imply  smaller  probability 
of  conflict. 

The probabilities of deadlock (PD) are presented in Figure 2.2.
Notice that PD is the conditional probability of a lock request causing
a deadlock, given that the request conflicts. The unconditional
probability of deadlock is the product of PC and PD, which is presented in
Figure 2.3. These data are also analyzed in three steps: visual
inspection, regression analysis, and analysis of the regression equation.

Figure 2.3 shows that for a fixed DZ, PD increases with both the
multiprogramming level MP and the transaction size TZ. But in contrast
to PC, the increase is larger with TZ than with MP.


If the diagonal line discussed previously is drawn for each table
in Figure 2.3, the number below the line is always smaller than the
corresponding number across the line, in sharp contrast to PC of Figure
2.1. Thus assuming equal loads L, a system with larger transactions and
lower multiprogramming level has a higher probability of deadlock than a
system with shorter transactions and higher multiprogramming level.

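Similarly, regression analysis shows equation (2.3) a good fit for
the data of Figure 2.3.

$$ PD' = PD \times PC = 0.012\,\frac{(MP-1)^{b}\,TZ^{c}}{DZ^{d}}, \qquad L = \frac{MP \times TZ}{DZ} \qquad (2.3) $$

where b, c, and d are exponents that vary linearly with the load L.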

We must emphasize that PD' is the probability of deadlock for a lock
request, not a transaction. Equation (2.3) shows that when the load L
is larger than 80%, the coefficient c is smaller than the coefficient b.
Therefore, for a fixed load of 80% or greater, a system with shorter
transactions and higher multiprogramming level has a higher probability
of deadlock than a system with longer transactions and lower
multiprogramming level. This rather surprising behavior is not immediately
apparent from inspection of Figure 2.3. This behavior occurs because
when the load is high and transactions are long, transactions deadlock
and abort frequently; and abortions of long transactions mean that more
locks are freed. Thus there is less probability of a lock request
causing a deadlock.

To analyze the relationship between PD and the lockable unit size,
we replace DZ by kDZ and TZ by wTZ, and equation (2.3) becomes equation
(2.4).

$$ PD'' = t \times PD' \qquad (2.4) $$

where

$$ t = \frac{(MP-1)^{0.54Lr}\,TZ^{3.7Lr}\,w^{3.5-(3.7Lw)/k}}{DZ^{2.1Lr}\,k^{1.9-(2.1Lw)/k}} \qquad (2.4a) $$

and

$$ r = 1 - w/k. $$

Setting w to k, equation (2.4a) becomes (2.4b), which shows that if and
only if the load L is less than one, which is within the range of our
simulation and is realistic, t is greater than one.

$$ t = k^{1.6(1-L)} \qquad (2.4b) $$


Thus,  when  the  database  is  well  placed,  smaller  lockable  units  imply  a 
larger  probability  of  deadlock. 
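
A quick numerical check of equation (2.4b), t = k^(1.6(1-L)), shows how the factor behaves on either side of L = 1. The function name `deadlock_factor_well_placed` is hypothetical and the sample values of k and L are chosen only for illustration.

```python
def deadlock_factor_well_placed(k, load):
    """Factor t of equation (2.4b) for a well placed database (w = k)."""
    return k ** (1.6 * (1.0 - load))

light = deadlock_factor_well_placed(4, 0.5)   # load below one: t > 1
even = deadlock_factor_well_placed(4, 1.0)    # load of one: t = 1
heavy = deadlock_factor_well_placed(4, 1.2)   # load above one: t < 1
```

For loads below one the factor exceeds one, so splitting each lockable unit into k smaller units raises the probability of deadlock, as stated in the text.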

Setting w to 1 for the originally badly placed system, equation
(2.4a) becomes (2.4c), which shows that, within the range of our
simulation, t is less than one. Therefore smaller lockable units reduce the
probability of deadlock.

In  summary,  larger  lockable  units  in  a  well  placed  system  and 
smaller  lockable  units  in  a  badly  placed  system  reduce  the  probability 
of  deadlock  for  lock  requests  and  transactions. 

$$ t = \frac{(MP-1)^{0.54Lr}\,TZ^{3.7Lr}}{DZ^{2.1Lr}\,k^{1.9-(2.1L)/k}} \qquad (2.4c) $$

where

$$ r = 1 - 1/k. $$


The average waiting times of a conflicting lock request are shown
in Figure 2.4, which shows that the average waiting time of a conflicting
lock request increases with the multiprogramming level and the
transaction size, and the increase is larger with the transaction size than
with the multiprogramming level. The result is consistent with our
intuition, because a lock request blocked by a long transaction must
wait until the long transaction completes or aborts; and it takes longer
for a long transaction to complete or abort. Also, if a similar
diagonal line is drawn for each table, the number above the line is always
larger than the corresponding number across the diagonal line.

Regression analysis shows equation (2.5) a good fit for the data of
Figure 2.4.

$$ WT = 0.19\,\frac{(MP-1)^{3.4(L+0.2)^2-0.3}\,TZ^{2.7(L+0.15)^2+0.8}}{DZ^{4.1(L-0.04)^2-0.17}} \qquad (2.5) $$

Assuming the database is well placed, to reduce the granularity of the
lockable units to 1/k of its original size, we increase the database
size DZ and transaction size TZ to kDZ and kTZ respectively in equation
(2.5), resulting in equation (2.6a). Equation (2.6a) shows that when
the load L is less than 1.4, which is realistic and within the range of
our simulations, smaller lockable units imply longer waiting for a
conflicting lock request. The result is consistent with the earlier
observation that longer transactions induce longer waiting.

$$ \frac{WT'}{WT} = t = k^{1.25-1.37(L-0.41)^2} \qquad (2.6a) $$
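
The load threshold quoted for equation (2.6a) can be reproduced numerically. The sketch below assumes the equation is read as t = k^(1.25 - 1.37(L - 0.41)^2); the function names are hypothetical. The exponent changes sign where 1.37(L - 0.41)^2 = 1.25, which gives a break-even load of about 1.37, close to the value of 1.4 stated in the text.

```python
import math

def waiting_factor_well_placed(k, load):
    """Factor t of equation (2.6a) for a well placed database (w = k)."""
    return k ** (1.25 - 1.37 * (load - 0.41) ** 2)

def break_even_load():
    """Load above which the exponent of k becomes negative (t < 1)."""
    return 0.41 + math.sqrt(1.25 / 1.37)

threshold = break_even_load()
below = waiting_factor_well_placed(2, 0.5)   # load under the threshold
above = waiting_factor_well_placed(2, 1.5)   # load over the threshold
```

Below the threshold the factor is greater than one, so smaller lockable units lengthen the wait of a conflicting lock request; above it the factor drops below one.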


Assuming the database is badly placed, to reduce the granularity of
the lockable units to 1/k of its original size we increase the database
size DZ to kDZ, but leave the transaction size TZ unchanged in equation
(2.5), resulting in equation (2.6b). Equation (2.6b) shows that when
the load is light and k is small, t is greater than one, i.e., a
conflicting lock request waits longer. As shown earlier this is because
when a database is badly placed and the load is light, reducing the size
of the lockable units reduces the probability of deadlock. With fewer
deadlocks, more transactions complete and fewer transactions abort.
Since a transaction takes longer to complete than to abort, a blocked
lock request waits longer.

$$ \frac{WT'}{WT} = t = \frac{DZ^{4.1rL(qL-0.08)}}{(MP-1)^{3.4rL(qL+0.4)}\,TZ^{2.7rL(qL+0.3)}\,k^{4.1(L/k-0.04)^2-0.17}} \qquad (2.6b) $$

where

$$ r = 1 - 1/k, \qquad q = 1 + 1/k. $$

In  summary,  whether  the  database  is  well  placed  or  badly  placed, 
smaller  lockable  units  increase  waiting  delay  for  a  blocked  lock 
request,  except  when  load  is  extremely  heavy,  the  database  is  badly 
placed,  and  the  reduction  in  lockable  unit  size  is  large. 

We  next  examined  the  standard  deviation  of  waiting  delays.  These 
results  can  be  summarized  very  simply. 

Regression on the data of Figure 2.5 results in equation 2.7, which
shows that the waiting delay may be approximated by an Erlangian
distribution.
Figure 2.3

Figure 2.4


2.4  Results  of  20/80  Access  Model 

The results of simulating the 20/80 access model are shown in
Figures 2.6 through 2.10. They are similar to the results of the random
access model with heavier load. The reason is that when 20% of the
database is used 80% of the time, the same load of the random access model
becomes a heavier load. The probability of conflict, the probability of
deadlock, and the average waiting time of a conflicting lock request still
increase with both the transaction size and the multiprogramming level.
Figure 2.5


The probability of conflict increases faster with the multiprogramming
level than with the transaction size, while the reverse is true for the
probability of deadlock and the average waiting time of a conflicting lock
request. If diagonal lines are drawn for the tables (as previously
explained), the number below the line is always larger than the
corresponding number above the line for the probability of conflict, and
the opposite is true for the probability of deadlock and the average
waiting time of a conflicting lock request. But the differences diminish as
the load becomes lighter.
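
One rough way to see why the same nominal load is heavier under the 20/80 model is to compare the chance that two independent lock requests pick the same lockable unit. The sketch below is not part of the study's analysis. It assumes accesses are uniform within the hot 20% and within the remaining 80% of the units, and the function name `collision_probability` is hypothetical.

```python
def collision_probability(dz, model="random"):
    """Probability that two independent requests select the same unit."""
    if model == "random":
        return 1.0 / dz                  # dz units, each with weight 1/dz
    hot = dz // 5                        # 20% of the units ...
    cold = dz - hot
    p_hot = 0.8 / hot                    # ... share 80% of the accesses
    p_cold = 0.2 / cold
    return hot * p_hot ** 2 + cold * p_cold ** 2

dz = 1000                                # hypothetical database size
ratio = collision_probability(dz, "20/80") / collision_probability(dz)
```

Under these assumptions the ratio is 3.25 for any database size divisible by five, so two requests collide about three times as often as in the random access model.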


Applying regression analysis to data in Figure 2.6 results in
equation (2.8). Similar to equation (2.1), it shows that the coefficient b
is always larger than the coefficient c. The major difference between
this equation and equation (2.1) is that the coefficient a of equation
(2.8) is equal to 2.7, much larger than the 0.72 of equation (2.1).

$$ PC = 2.7\,\frac{(MP-1)^{1.08+1.51L}\,TZ^{1.08+0.58L}}{DZ^{1.08+1.39L}} \qquad (2.8) $$

where

$$ L = \frac{MP \times TZ}{DZ}. $$

To  examine  the  relationship  between  the  probability  of  conflict  and  the 
lockable  unit  size,  we  replace  TZ  by  wTZ  and  DZ  by  kDZ  in  equation 
(2.8),  and  obtain  equation  (2.9). 


$$ PC' = t \times PC \qquad (2.9) $$

where

$$ t = \frac{DZ^{1.39Lr}\,w^{1.08+(0.58Lw)/k}}{(MP-1)^{1.51Lr}\,TZ^{0.58Lr}\,k^{1.08+(1.39Lw)/k}} \qquad (2.9a) $$

and

$$ r = 1 - w/k. $$

If the database is well placed, then w is equal to k, and equation
(2.9a) becomes equation (2.9b), which shows that smaller lockable units
reduce the probability of conflict, consistent with the result of the
random access case.

$$ t = k^{-0.81L} \qquad (2.9b) $$

If the database is badly placed, then w is equal to one, and equation
(2.9a) becomes equation (2.9c). Equation (2.9c) shows that if the load
L is less than 50%, which is within the range of our simulations and is
realistic, smaller lockable units reduce the probability of conflict. In
summary, whether the database is originally well or badly placed, reducing
lockable units reduces the probability of conflict. This result is the
same as in the random access model.

$$ t = \frac{DZ^{1.39Lr}}{(MP-1)^{1.51Lr}\,TZ^{0.58Lr}\,k^{1.08+(1.39L)/k}} \qquad (2.9c) $$

where

$$ r = 1 - 1/k. $$

Regression of the data in Figure 2.8 results in equation (2.10),
which shows that when the load L is greater than 33%, the coefficient c
is smaller than the coefficient b. Therefore, for a fixed load of 33%
or higher, a system with higher multiprogramming level and smaller
transactions has higher probability of deadlock than a system with lower
multiprogramming level and longer transactions. This result is similar
to the random access model.

$$ PD' = PD \times PC = a\,\frac{(MP-1)^{b}\,TZ^{3.88-4.74L}}{DZ^{2.33-0.13L}} \qquad (2.10) $$

where a is a constant, b is an exponent that varies linearly with the
load L, and

$$ L = \frac{MP \times TZ}{DZ}. $$


To  examine  the  relationship  between  the  probability  of  deadlock  and  the 
lockable  unit  sixe,  we  replace  TZ  by  wTZ  and  DZ  by  kDZ  in  equation 
(2.10),  and  obtain  equation  (2.11). 


$PD'' = t \times PD'$    (2.11)

where

t = [expression in DZ, L, r, w, and k; illegible in the source]    (2.11a)


If the database is well placed, then w is equal to k, and equation
(2.11a) becomes equation (2.11b). Similar to equation (2.4b), it shows
that when the load L is less than 34%, which is realistic and within the
range of our simulations, t is greater than one. That means larger
lockable units reduce the probability of deadlock. This result is
similar to the one found in the random access model.

$t = k^{1.35 - 4.61L}$    (2.11b)


For  the  badly  placed  database,  setting  w  to  one  in  equation  (2.11a) 
results  in  equation  (2.11c),  which  shows  that,  within  the  range  of  our 
simulations,  smaller  lockable  units  reduce  the  probability  of  deadlock. 
This  result  is  also  similar  to  the  one  found  in  the  random  access  model. 


t = [expression illegible in the source]    (2.11c)


Regression on the data in Figure 2.9 results in equation (2.12).

[regression expression illegible in the source]    (2.12)


Replacing DZ by kDZ and TZ by kTZ, equation (2.12) becomes equation
(2.13a), which shows, as does equation (2.6a), that t is greater than
one; that is, the waiting delay for a conflicting lock request becomes
longer.

$t = k^{0.1 + 1.4(L - 0.4)^2}$    (2.13a)
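The claim that t exceeds one can be checked numerically. The sketch below assumes equation (2.13a) reads t = k^(0.1 + 1.4(L - 0.4)^2), which is our reading of the scan; the function name `waiting_delay_scale` is hypothetical.

```python
def waiting_delay_scale(k: float, load: float) -> float:
    """Factor t applied to the waiting delay when both DZ and TZ are
    multiplied by k, under the assumed reading of equation (2.13a)."""
    if k <= 0:
        raise ValueError("k must be positive")
    exponent = 0.1 + 1.4 * (load - 0.4) ** 2
    return k ** exponent
```

Because the exponent is at least 0.1 for every load L, any k greater than one yields t greater than one under this reading, which matches the statement in the text.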


Replacing DZ by kDZ, but leaving TZ unchanged, equation (2.12)
becomes equation (2.13b), which shows, as does equation (2.6b), that
when the load is light and k is small, t is greater than one. Therefore,
in general, reducing the size of lockable units increases the
waiting delay of a conflicting lock request, except when the load is
heavy, the database is badly placed, and the reduction of lockable unit
size is large.

t = [expression illegible in the source]    (2.13b)

where [definitions illegible in the source].

Regression  on  the  data  in  Figure  2.10  results  in  equation  (2.14). 

$DV = -0.88 + WT$    (2.14)

2.5 Summary

We simulated two-phase locking in a DBMS with fairly constant
communication and I/O delays. We collected performance data and
regressed these data into equations relating the performance of the DBMS
to the multiprogramming level, the transaction size, and the database
size. Using these equations we examined the interaction between the
performance of a DBMS and the lockable unit size.


We found the performance behavior of a DBMS with a random database
access distribution quite similar to that of a DBMS with the 20/80 access
distribution: the 20/80 system behaves as a random access system under
heavy load. In fact, the same regression models (equations) with different
coefficient values fit both access models well, except for the standard
deviation of the lock request waiting delay.

The probability of conflict of a lock request increases more than
linearly with the multiprogramming level and the transaction size; the
increase is larger with the multiprogramming level than with the
transaction size. The probability of deadlock, the average waiting delay
of a conflicting lock request, and the standard deviation of that delay
also increase more than linearly with the multiprogramming level and the
transaction size, but the increase is smaller with the multiprogramming
level than with the transaction size.

The waiting delay of a conflicting lock request can be approximated
by an Erlangian distribution in the random access model. This result
can be extremely useful for researchers who use queueing theory to model
a DBMS.

The results of this study have been validated, and can be extrapolated
for database sizes of up to 12384, multiprogramming levels of up to
128, and transaction sizes of up to 32.

So far we have concentrated on the basic factors of PC, PD, WT, and
DV. We will next briefly discuss the combination of these blocking and
restart variables into system throughput, a measure of performance which
is more directly useful to a system designer.

In the highly functional model used here, all system resources are
represented by the time to process lock requests. Since each request
consumes the same time, we measure throughput by the number of lock
requests processed by transactions which finish.

In every case, throughput decreases with increasing TZ if MP and
DZ are held constant. As noted above, for longer transactions there are
more conflicts, more deadlocks, and longer delays. The message for
application program design is clear: transactions should be made as
small as possible.


Also, throughput increases with increasing DZ if MP and TZ are held
constant. This is the "badly placed locks" case, and it also can be
anticipated from the analysis above. For random access of data, small
granules will provide better throughput when both blocking and restart
behavior are considered. However, because of the increasing
communications and processing costs of lock management, the response time
will increase. The optimal granularity can be calculated from the
regression equations.

Finally, throughput first increases and then decreases with
increasing MP if TZ and DZ are constant. Given a particular granularity
and transaction size, for light loads, significant gains in throughput
can be attained by increasing the multiprogramming level. However, as
the system load becomes heavier, the losses to deadlock and restart more
than outweigh the gains from increased concurrency.

3.1 Introduction

Many distributed concurrency control algorithms have been proposed
(Bad[1], Ber[1], Ell[1], Gar[1], Lin[4], Ros[1], Ste[1], Sto[1],
Tho[1]). But how well do they perform? A few researchers have
attempted to compare the performance of different algorithms (Gar[1],
Lin[1], Lin[2], Rie[1], Tha[1]). Unfortunately they all used different
assumptions about system and application parameters. Furthermore, they
compared different algorithms using different performance measures. For
example, in Lin[1], two timestamping algorithms are compared, and the
performance measure used is the average response time.

In Rie[1], two two-phase locking algorithms (the centralized
method and the primary copy method) are compared. Performance measures
include utilization of devices, average transaction response time,
and others. In addition, a transaction must obtain all its locks before
it can start, no duplicate data is allowed, and multiprogramming level
is assumed to have no effect on system performance.

In Gar[1], three two-phase locking algorithms are compared: the
centralized method, the voting method, and the ring method. Both
utilization of devices and average response time are performance measures.
A few major assumptions are made: multiprogramming level is one at each
node, a transaction must obtain all locks before it proceeds, and a
transaction requests all update locks in parallel. In addition, the
main results apply to a fully replicated system.

In Lin[2], distributed two-phase locking algorithms are abstracted
and encapsulated in one model, and the relation between system blocking
behavior (conflict and deadlock) and various system and application
parameters is studied in detail. Two access distributions of the
database are simulated. The first one has a uniform distribution: every
data granule is equally likely to be accessed by a lock request; the
second one has 20% of the database accessed by 80% of the lock requests.
It concludes that the more concentrated database access distribution has
the same effect on system blocking behavior as the uniform distribution
under heavier load.


In Tha[1], two two-phase locking algorithms are simulated: basic
two-phase and centralized two-phase. The performance measure used is
the average transaction response time. The simulation model includes
many system and application parameters, thus necessitating a large
number of simulation runs. But only the results of a few simulation
runs with limited values of these parameters are presented.

These performance studies are very difficult to compare, and it is
almost impossible to integrate their results. They compare different
algorithms, they make different assumptions about system and application
environments, and they employ different measures for system performance.
This paper is part of a major effort to compare the principal distinct
concurrency control algorithms, using the same performance measures and
the same assumptions that are consistent with various system and
application environments.

This  paper  reports  part  of  the  results  of  simulation  on  two-phase 
locking.  Section  3.2  describes  the  simulation  model,  Section  3.3 
discusses  the  simulation  results,  and  Section  3.4  concludes  the  study. 

3.2  The  Simulation  Model 

Two phase locking causes blocking and deadlocks among transactions.
Blocking occurs because two or more transactions may request the same
data item at the same time. Blocking degrades system performance
because a blocked lock request must wait for the blocking transaction to
complete or abort. This waiting is called blocking delay. Deadlock
occurs when two or more transactions directly or indirectly block each
other. Deadlock also degrades system performance, because a deadlock
causes a partially completed transaction to abort and restart.

Transaction blocking and restarting are affected by many system and
application characteristics. These include average transaction size
(number of locks requested by a transaction), multiprogramming level
(number of transactions running concurrently), database size (number of
locking granules), access distribution of the database (probability of
each data granule being accessed by a lock request), frequency of local
and remote requests, locking granularity, communication network, I/O
devices, memory size, CPU speed, and others. Thus, to accurately
evaluate the restarting and blocking behavior of two phase locking, we
must include all these factors in the simulation model, and this is too
expensive to do directly.

To simplify the simulation, we model the system and transactions in
a highly functional model. Much of the detail of a real distributed
system is captured in a few parameters, which are used as inputs to the
simulation model. This approach permits us to greatly reduce the number
of simulation runs necessary, and also to reduce the complexity of the
model, while retaining most of the impact that these details have on the
performance of the concurrency control algorithms.

We model a transaction as a sequence of lock requests. Between two
consecutive lock requests, a transaction incurs two kinds of delays:
blocking delay and communication delay. The communication delay
includes communication network delay, I/O delay, and CPU processing
delay; it is called communication delay because in a distributed system
it is likely that the communication network delay dominates the I/O and
the CPU delays. The communication delay is an input parameter to our
simulation model, while the blocking delay is an output parameter.
Thus, the blocking delay is measured as a function of the communication
delay.

For each simulation run, we assume the communication delay to have a
certain probability distribution, but we vary the probability
distribution for different simulation runs. We use only hypo-exponential
and hyper-exponential distributions; therefore each distribution can be
characterized by its average and standard deviation.
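One way to draw delays with a prescribed average and standard deviation is sketched below. This is an illustration under our own assumptions, not the report's generator: when the standard deviation exceeds the mean we use a two-phase hyperexponential with balanced means, and when it is smaller we use a gamma distribution as a stand-in for a hypo-exponential (a gamma with integer shape is an Erlang). All function names are hypothetical.

```python
import math
import random


def h2_parameters(mean: float, std: float) -> tuple[float, float, float]:
    """Balanced-means two-phase hyperexponential matching mean and std.

    Returns (p1, rate1, rate2). Requires std > mean.
    """
    cv2 = (std / mean) ** 2
    if cv2 <= 1.0:
        raise ValueError("a hyperexponential needs std > mean")
    p1 = 0.5 * (1.0 + math.sqrt((cv2 - 1.0) / (cv2 + 1.0)))
    return p1, 2.0 * p1 / mean, 2.0 * (1.0 - p1) / mean


def h2_moments(p1: float, rate1: float, rate2: float) -> tuple[float, float]:
    """Exact mean and standard deviation of the two-phase mixture."""
    p2 = 1.0 - p1
    first = p1 / rate1 + p2 / rate2
    second = 2.0 * p1 / rate1 ** 2 + 2.0 * p2 / rate2 ** 2
    return first, math.sqrt(second - first ** 2)


def sample_delay(mean: float, std: float, rng: random.Random) -> float:
    """Draw one delay with the requested mean and standard deviation."""
    cv2 = (std / mean) ** 2
    if cv2 < 1.0:
        # Assumed stand-in for a hypo-exponential: gamma with
        # shape 1/cv2 and scale mean*cv2 has the requested moments.
        return rng.gammavariate(1.0 / cv2, mean * cv2)
    if cv2 == 1.0:
        return rng.expovariate(1.0 / mean)
    p1, rate1, rate2 = h2_parameters(mean, std)
    rate = rate1 if rng.random() < p1 else rate2
    return rng.expovariate(rate)
```

Calling `sample_delay(1.0, 5.28, random.Random(0))` repeatedly would produce delays with unit mean and a heavy tail; other two-moment fits are possible, and the choice of fit is exactly the kind of detail the text abstracts away.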

Since the communication delay, consisting of communication network
delay, I/O delay, and CPU delay, is an input parameter and is modeled by
an abstract probability distribution function, we make no assumptions
about the characteristics of the underlying communication network, I/O
devices, and CPUs, or about their relative performance. In fact,
communication networks, I/O devices, and CPUs of various performance
characteristics are modeled by different probability distribution
functions. For example, a high bandwidth and lightly loaded system has
small variation in communication delay, and thus it can be modeled by a
distribution function with a small standard deviation, while a low
bandwidth and heavily loaded system can be modeled by a distribution
function with a large standard deviation. Notice that the average
communication delay is used as the simulation time unit; therefore the
average communication delay is not a factor in the simulation model. Thus
the simulation results, especially the blocking delay, must be scaled
according to the actual average communication delay.

Besides communication delay, input parameters of the simulation
model include average transaction size, multiprogramming level, database
size, and ratio of read-only transactions to update-only transactions
entering the system. Besides blocking delay, performance measures
(output parameters) include the probability of conflict of lock requests,
the probability of deadlock and restart of lock requests, and the number
of locks held by a transaction when the transaction deadlocks and
restarts.

We did not explicitly include the frequency of local and remote
data requests as an input parameter because it is captured by the
probabilistic distribution of the communication delay. For example, a
system with mostly local data requests can be modelled by a distribution
function with a small mean value. Neither did we include locking
granularity as an input parameter, because locking granularity is a
function of the database size and the transaction size; increasing the
granularity is equivalent to decreasing the database size and the
transaction size. Moreover, we simulated only random access to the
database (every locking granule has the same probability of being
accessed by a lock request) because our previous study (Lin[2]) showed
that more concentrated access distributions had the same results as the
random access distribution under heavier load. The input parameters of
the simulation model are discussed further in the remainder of the
section.

For a database size (DZ) of N, N locks and N queues for the locks
are simulated. Deadlock can occur, and the transaction in the deadlock
cycle that holds the fewest locks aborts and restarts immediately. We
choose this particular transaction to abort because our previous study
(Lin[3]) concludes that this deadlock resolution algorithm performs best
in all system and application environments.
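The victim rule can be sketched as follows. The data structures are our assumptions, not the report's: `waits_for` maps each blocked transaction to the single transaction it is waiting on (reasonable when locks are requested sequentially and are exclusive; shared locks could make a request wait on several holders), and ties are broken by transaction name, which the report does not specify.

```python
def find_cycle(waits_for: dict[str, str], start: str) -> list[str]:
    """Follow waits-for edges from `start`.

    Returns the transactions on the cycle that is reached, in order,
    or an empty list if the chain ends without closing a cycle.
    """
    position = {}
    path = []
    node = start
    while node in waits_for and node not in position:
        position[node] = len(path)
        path.append(node)
        node = waits_for[node]
    if node in position:
        return path[position[node]:]
    return []


def choose_victim(cycle: list[str], locks_held: dict[str, int]) -> str:
    """Pick the transaction in the cycle that holds the fewest locks.

    Ties are broken by name (an assumption made for determinism).
    """
    return min(cycle, key=lambda txn: (locks_held[txn], txn))
```

In a simulator the victim would then release its locks and restart immediately, as the text describes.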


Transaction size (TZ) is assumed to be exponentially distributed.
The average of the distribution varies among different simulation runs,
but remains fixed within a simulation run. A transaction requests locks
sequentially, but different transactions request locks asynchronously.
This model is general enough to include all transaction types of
interest. For example, to model transactions in which read requests and
update requests respectively are issued in parallel only once, the
transaction size can be set to two.

After a lock request is granted, a transaction waits for a period
of time before requesting another lock. The period of time is the
communication delay discussed previously. The average of the communication
delay is fixed at one for all simulation runs, but the standard
deviation varies among different simulation runs. The simulation results
can easily be scaled to whatever the actual average of the communication
delay may be.

For  a  simulation  run,  the  multiprogramming  level  (MP)  is  fixed; 
thus  the  model  is  closed,  and  a  new  transaction  is  generated  and  started 
only  after  one  completes.  The  results  of  the  simulation  are  presented 
in  the  next  section. 

3.3  Simulation  Results 

We simulated three different distributions of communication delay.
All are Erlangian and have the same average delay of one time unit; one
has a standard deviation of 0.368, the second 1.87, and the third 5.28.
For each standard deviation of communication delay, we simulated three
different multiprogramming levels, three average transaction sizes, and
three database sizes, for a total of 81 system configurations. Figures
3.1 through 3.8 show the results.

From the results we can conclude that the standard deviation of
communication delay has no effect on the probability of conflict and the
probability of deadlock of a lock request, or on the number of locks
held when a transaction deadlocks. But it does have an effect on the
average waiting time of blocked lock requests (blocking delay): the
larger the standard deviation, the longer the average waiting time. We
discuss the details of these observations in the rest of the section.

Each point of Figure 3.1 represents the probabilities of conflict
(PC) of two system configurations with the same multiprogramming level,
average transaction size, and database size, but with different standard
deviations of communication delay (DEV). One has a standard deviation of
0.368, and the other of 5.28. The X-coordinate of the point represents
the probability of conflict of the former configuration, while the
Y-coordinate represents the probability of conflict of the latter
configuration. From the figure, we can see that all points lie very close
to the diagonal line, implying that two system configurations with widely
different standard deviations of communication delay have the same
probability of conflict. Thus we can conclude that the standard deviation
of communication delay has no effect on the probability of a lock
request conflicting with another lock request. The reason is that the
probability of conflict of a lock request depends only on the total
number of locks outstanding in the system, which our simulation results
show to be independent of the standard deviation of communication delay.

Figure 3.1 PC (DEV = 5.28) vs. PC (DEV = 0.368)


Similar to Figure 3.1, Figure 3.2 represents the probability of
deadlock and abort of a lock request for the two different standard
deviations of communication delay. From the figure, we can also conclude
that the standard deviation of communication delay has no effect
on the probability of deadlock and abort of a lock request.

Figure 3.2 PD (DEV = 5.28) vs. PD (DEV = 0.368)

The reason is that the probability of deadlock of a lock request
depends on the total number of locks outstanding in the system, and on
how these locks are distributed among transactions. Our simulation
results show that these two factors are independent of the standard
deviation of communication delay.

Figures 3.3a through 3.3c plot, for average transaction sizes of 4,
16, and 32 respectively, the average waiting delay of blocked lock
requests against the standard deviation of communication delay for
various system configurations. They show that the average waiting delay
increases linearly with the standard deviation of communication delay
when the average transaction size (TZ) and the multiprogramming level
(MP) are small. Otherwise the increase is less than linear with
communication delay variation. In fact, when the average transaction size
is larger than 32, the variation has little effect on the waiting delay.

Figure 3.3a Average Blocking Delay when TZ = 4

Figure 3.3b Average Blocking Delay when TZ = 16

This can be explained by the following example. If all the transactions
in the system are small, and the standard deviation of communication
delay is large, then the time required by each transaction to complete
varies greatly, say from 10 time units to 35 time units, with an average
of 15 time units. Since blocked requests tend to wait for transactions
that have been in the system longer, these blocking transactions tend to
take from 15 to 35 time units to complete, with an average of 25 time
units. But if the standard deviation of communication delay is small,
then the time required by a transaction to complete varies less, say
from 10 to 20 time units (with the same average of 15 time units). In
that case the blocked requests most likely wait for transactions that
require from 15 to 20 time units to complete, with an average of 17 time
units, which is much smaller than the 25 time units of the previous case.
Therefore the blocked requests in a configuration with a larger standard
deviation of communication delay tend to wait longer if the average
transaction size is small.
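The example is an instance of length-biased sampling. As a rough sketch, suppose (our assumption; the report says only that blocked requests "tend to" wait on longer-lived transactions) that a blocked request meets a transaction with probability proportional to how long that transaction runs. The mean duration it then sees is E[X^2]/E[X], which grows with the spread of X even when the plain mean is unchanged. The duration lists in the usage note are illustrative, not simulation data.

```python
def mean_duration(durations: list[float]) -> float:
    """Plain average completion time over all transactions."""
    return sum(durations) / len(durations)


def length_biased_mean(durations: list[float]) -> float:
    """Average completion time seen when each transaction is picked
    with probability proportional to its own duration."""
    total = sum(durations)
    return sum(d * d for d in durations) / total
```

For instance, the lists `[10, 15, 20]` and `[5, 15, 25]` both have a plain mean of 15, but the length-biased mean is larger for the more widely spread list, which is the direction of the effect argued in the text.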

If all transactions are large, then the time required by a
transaction to complete varies little with the standard deviation of
communication delay. The communication delay between two consecutive lock
requests by the same transaction may vary widely if the standard
deviation of communication delay is large, but since each transaction
requests many locks, these variations eventually average out within each
transaction. Therefore the average waiting delay of blocked requests is
relatively invariant with the standard deviation of communication delay
if the average transaction size is large.
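The averaging argument can be made concrete under simplifying assumptions of our own: a transaction issues a fixed number n of lock requests, the n communication delays are independent with the same mean and standard deviation, and blocking is ignored. The total then has mean n times the delay mean and standard deviation sqrt(n) times the delay standard deviation, so its relative spread shrinks as 1/sqrt(n). In the report the transaction size is itself random, which adds variation that this sketch leaves out.

```python
import math


def completion_time_cv(num_requests: int, delay_std: float,
                       delay_mean: float = 1.0) -> float:
    """Coefficient of variation (std / mean) of the sum of
    `num_requests` independent delays with the given moments."""
    if num_requests <= 0:
        raise ValueError("num_requests must be positive")
    return (delay_std / delay_mean) / math.sqrt(num_requests)
```

The function only shows the direction of the effect: for any fixed delay standard deviation, a transaction with more lock requests has a completion time that is relatively less variable.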


In Figure 3.4, the X-coordinate and the Y-coordinate of each point
represent respectively the average wait and the standard deviation of
the wait of blocked lock requests of a system configuration. These 58
points represent the 81 runs, with some runs overlapping on the same
points. The figure shows that the standard deviation of the waiting
delays of blocked requests is a fixed ratio of the average waiting delay
regardless of multiprogramming level, average transaction size, database
size, and communication delay variation. This observation implies that
the waiting delay of blocked lock requests may have a fixed distribution
function regardless of the system configuration, although the average of
the distribution function depends on the system configuration, as
Figures 3.3a through 3.3c show. We plotted the distribution functions for
some of the configurations, and all of them look similar to the one
shown in Figure 3.3, which closely approximates a 2-stage hypoexponential
(Erlangian) distribution function. In fact, the fixed ratio of Figure
3.4 (the slope of the line) approximates the ratio of the standard
deviation to the mean of a 2-stage hypoexponential distribution.

Figure 3.4 Average vs. Standard Deviation of Blocking Delay
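For reference, the ratio of standard deviation to mean for a 2-stage Erlangian distribution follows from a short calculation. The derivation below assumes both stages are exponential with the same rate, written here as \lambda (our notation); with unequal stage rates the ratio lies between 1/\sqrt{2} and 1.

```latex
% X = X_1 + X_2, with X_1 and X_2 independent and exponential with rate \lambda
\begin{align*}
E[X] &= \frac{2}{\lambda}, &
\operatorname{Var}(X) &= \frac{2}{\lambda^{2}}, \\
\frac{\sqrt{\operatorname{Var}(X)}}{E[X]}
  &= \frac{\sqrt{2}/\lambda}{2/\lambda}
   = \frac{1}{\sqrt{2}} \approx 0.707 .
\end{align*}
```

A slope near 0.7 in a plot of standard deviation against mean is therefore what a 2-stage Erlangian waiting delay would produce.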

For each system configuration, we compute the average number of
locks held by transactions when they deadlock and abort, denoted by
LOCKS_AT_DEADLOCK. Each point of Figure 3.6 represents the average
LOCKS_AT_DEADLOCK of two system configurations with the same
multiprogramming level, average transaction size, and database size, but
with different standard deviations of communication delay; one has
standard [...] two configurations with widely different standard
deviations of communication delay have the same LOCKS_AT_DEADLOCK. Thus
we can conclude that the communication delay variation has no effect on
the number of locks held by a transaction when the transaction deadlocks
and aborts.

Figure 3.7 Average vs. Standard Deviation of LOCKS_AT_DEADLOCK

For each system configuration, the standard deviation of the number
of locks held by transactions when they deadlock and abort, represented
by DEV_LOCKS_AT_DEADLOCK, is plotted against LOCKS_AT_DEADLOCK in
Figure 3.7. This plot approximates a straight line, indicating that
DEV_LOCKS_AT_DEADLOCK may be a fixed ratio of LOCKS_AT_DEADLOCK.
Regression analysis indicates the ratio is about 0.70, implying that the
number of locks held at deadlock may have a 2-order negative binomial
distribution. This distribution is obtained by throwing a biased coin
repeatedly until we obtain the second success. The number of throws
represents the number of locks held by a transaction when the
transaction deadlocks and aborts. The mean of the distribution function
depends on the bias of the coin, and the bias of the coin depends on the
multiprogramming level, the average transaction size, and the database
size. We plotted the distribution functions of a few system
configurations and found that all of them look like the one shown in
Figure 3.8, which closely approximates a 2-order negative binomial
distribution.
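The coin-throwing description translates directly into a small sketch. The success probability `p` stands for the bias of the coin; its value for any given configuration is not stated in the report, so the values used in the usage note are placeholders, and the function names are ours. The closed-form ratio uses the standard moments of the number of throws needed for a fixed number of successes.

```python
import math
import random


def throws_until_second_success(p: float, rng: random.Random) -> int:
    """Throw a coin with success probability p until the second
    success and return the number of throws made."""
    throws = 0
    successes = 0
    while successes < 2:
        throws += 1
        if rng.random() < p:
            successes += 1
    return throws


def neg_binomial_ratio(p: float, order: int = 2) -> float:
    """Standard deviation divided by mean for the number of throws
    needed to reach `order` successes."""
    mean = order / p
    std = math.sqrt(order * (1.0 - p)) / p
    return std / mean
```

The ratio simplifies to sqrt((1 - p) / 2) for order 2, so it approaches 1/sqrt(2), about 0.707, as p becomes small; a placeholder value such as p = 0.02 gives 0.70, in line with the regression result quoted above.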


Figure 3.8 Distribution of LOCKS_AT_DEADLOCK

3.4 Conclusion

In summary, we can conclude that communication delay variation has
no effect on the chance of a lock request conflicting or deadlocking
with another lock request. It also has no effect on the number of locks
held by a transaction when the transaction deadlocks and aborts. In
fact, the number of locks held by a transaction when it deadlocks and
aborts has a 2-order negative binomial distribution with its average and
standard deviation independent of the communication delay variation.

The blocking delay of a blocked lock request has a 2-stage Erlangian
distribution regardless of the standard deviation of communication
delay. The mean of the distribution is also independent of the standard
deviation of communication delay if the average transaction size is
large. However, if the average transaction size is small, the mean of
the distribution function depends on the standard deviation of
communication delay: the larger the variation, the longer the average
blocking delay. But in many cases, conflict occurs rarely, so its
effect on system performance is insignificant.


These results are important to performance modeling of distributed
concurrency control, because they eliminate the standard deviation of
communication delay as one of the system parameters that affect system
performance. For an analytical model, this means that the communication
delay can be assumed to have an exponential distribution, which
simplifies the model. For a simulation model, this means a geometric
reduction of the number of simulation runs.

4. Read Only Transactions and Two Phase Locking


Abstract 

Intuition tells us that in a distributed DBMS using two phase locking,
the ratio (denoted by R/W) of read-only to update transactions affects
system performance: the higher the ratio, the better the performance.
Read-only transactions request only share locks, and thus should
cause fewer conflicts and deadlocks among all transactions. Therefore
both read-only and update transactions are expected to perform better if
R/W is higher. This paper reports the results of a study contradicting
this intuition, and discusses the relationship between the R/W ratio and
system performance in detail.


4.1  Introduction 


Many distributed concurrency control algorithms have been proposed
(Bad[1], Ber[1], Ell[1], Gar[1], Lin[4], Ros[1], Ste[1], Sto[1],
Tho[1]). But how well do they perform? A few researchers have
attempted to compare the performance of different algorithms (Gar[1],
Lin[1], Lin[2], Mun[1], Rie[1], Tha[1]). Unfortunately they all used
different assumptions about system and application parameters.
Furthermore, they compared different algorithms using different
performance measures. For example, in Lin[1], two timestamping algorithms
are compared, and the performance measure used is the average response
time.

In Rie[1], two two-phase locking algorithms (the centralized
method and the primary copy method) are compared. Performance measures
include utilization of devices, average transaction response time,
and others. In addition, a transaction must obtain all its locks before
it can start, no duplicate data is allowed, and multiprogramming level
is assumed to have no effect on system performance.

In Gar[1], three two-phase locking algorithms are compared: the
centralized method, the voting method, and the ring method. Both
utilization of devices and average response time are performance measures.
A few assumptions are made: all transactions are update transactions;
multiprogramming level is one at each node; a transaction must obtain all
locks before it proceeds; a transaction requests all locks in parallel;
and the database is fully duplicated in every site (performance of a
partitioned database is treated briefly).

In Lin[2], distributed two-phase locking algorithms are abstracted
into one model, and the relation between system blocking behavior
(conflict and deadlock) and various system and application parameters
is studied. Two access distributions of the database are simulated. The
first one has a uniform distribution: every data granule is equally
likely to be accessed by a lock request; the second one has 20% of the
database accessed by 80% of the lock requests. It concludes that the
more concentrated database access distribution has the same effect on
system blocking behavior as the uniform distribution under heavier load.


In Tha[1], two two-phase locking algorithms are simulated -- basic
two-phase  and  centralized  two-phase.  The  performance  measure  used  is 
the  average  transaction  response  time.  The  simulation  model  includes 
many  system  and  application  parameters,  but  results  of  only  a  few  simu¬ 
lation  runs  with  limited  values  of  these  parameters  are  presented. 

These  performance  studies  are  very  difficult  to  compare,  and  it  is 
almost  impossible  to  integrate  their  results.  They  compare  different 
algorithms,  and  they  use  different  assumptions  about  system  and  applica¬ 
tion  environments  and  different  measures  for  system  performance. 
Therefore  we  began  a  major  project  in  order  to  compare  the  principal 
distinct  distributed  concurrency  control  and  reliability  algorithms, 
using  the  same  model,  same  assumptions,  same  performance  (output)  param¬ 
eters,  and  the  same  system  and  application  (input)  parameters.  This 
paper  reports  part  of  the  findings  of  this  project.  In  particular,  this 
paper  reports  our  findings  about  the  relationship  between  read-only 
transactions  and  the  performance  of  two  phase  locking.  We  found  that, 
when  the  ratio  of  read-only  transactions  to  update  transactions 
increases  from  1/3  to  3/1,  the  response  times  of  both  read-only  and 
update  transactions  and  total  system  through-put  remain  unchanged, 
except  when  the  system  load  is  extremely  heavy  and  transactions  are 
long. 

This  paper  is  organized  as  follows.  Section  4.2  describes  the 
simulation  model.  Section  4.3  discusses  the  simulation  results,  and  Sec¬ 
tion  4.4  concludes  the  study. 

4.2  The  Simulation  Model 

Because  this  paper  reports  part  of  the  findings  of  a  larger  pro¬ 
ject,  we  describe  the  simulation  model  of  the  larger  project  first.  We 
model  a  transaction  as  a  sequence  of  lock  requests.  A  lock  request 
incurs two kinds of delay. The first, called blocking delay, occurs
because of lock conflict (two requests ask for the same lock). The
second, called communication delay, consists of communication network
delay, I/O delay, CPU processing delay, and others.

Blocking delay and communication delay are affected by many factors
in addition to the R/W ratio. These include average transaction size
(number of locks requested by a transaction), multiprogramming level
(number of transactions running concurrently), database size (number of
data granules), access distribution of the database (probability of each
locking granule being accessed by a lock request), communication
network, I/O devices, memory size, CPU speed, and others (locking
granularity is not explicitly considered because it is a function of the
transaction size and the database size). Thus to accurately evaluate
these two delays, we must include all these factors in the simulation
model, and this is too expensive to do directly.

To  simplify  the  simulation,  we  divided  the  simulation  into  two 
steps.  During  the  first  step,  we  considered  the  communication  delay  as 
one  of  the  input  parameters,  and  the  blocking  delay  as  one  of  the  output 
parameters  (performance  measures);  thus  the  blocking  delay  is  measured 
as  a  function  of  the  communication  delay.  For  each  simulation  run,  we 
assumed the communication delay to have a certain probability
distribution, but we varied the probability distribution for different
simulation runs. We used only hypo-exponential and hyper-exponential
distributions; therefore each distribution can be characterized by its
average and standard deviation.
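
As an illustration of how a delay distribution can be pinned down by its
average and standard deviation, the following Python sketch fits a
two-stage hypo-exponential (two exponential stages in series) when the
standard deviation is below the mean, and a two-branch hyper-exponential
with balanced means when it is above. The report does not say how its
distributions were constructed, so the function names, the two-stage
restriction, and the balanced-means choice are assumptions made here for
illustration only.

```python
import math
import random


def fit_hypoexponential(mean, std):
    """Rates of two exponential stages in series with the given moments.

    Two stages can only reach 0.5 <= (std/mean)**2 < 1; smaller
    deviations would need more stages (not covered in this sketch).
    """
    cv2 = (std / mean) ** 2
    if not 0.5 <= cv2 < 1.0:
        raise ValueError("two stages need 0.5 <= cv^2 < 1")
    root = math.sqrt(2.0 * std ** 2 - mean ** 2)
    stage_means = ((mean + root) / 2.0, (mean - root) / 2.0)
    return tuple(1.0 / m for m in stage_means)


def fit_hyperexponential(mean, std):
    """(p1, rate1, rate2) for a two-branch mixture with balanced means."""
    cv2 = (std / mean) ** 2
    if cv2 <= 1.0:
        raise ValueError("hyperexponential needs cv^2 > 1")
    p1 = 0.5 * (1.0 + math.sqrt((cv2 - 1.0) / (cv2 + 1.0)))
    return p1, 2.0 * p1 / mean, 2.0 * (1.0 - p1) / mean


def sample_delay(mean, std, rng):
    """Draw one delay with the requested mean and standard deviation."""
    if std < mean:
        r1, r2 = fit_hypoexponential(mean, std)
        return rng.expovariate(r1) + rng.expovariate(r2)
    if std > mean:
        p1, r1, r2 = fit_hyperexponential(mean, std)
        return rng.expovariate(r1 if rng.random() < p1 else r2)
    return rng.expovariate(1.0 / mean)  # std == mean: plain exponential
```

With the mean fixed at one, a standard deviation of 0.75 fits the
two-stage form, while 1.87 and 5.28 fall in the hyper-exponential range.
A standard deviation of 0.573 is outside what two stages can represent
and would need a construction with more stages.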

Since the communication delay, consisting of communication network
delay, I/O delay, and CPU delay, is an input parameter and is modeled by
an abstract probability distribution function, we made no assumptions
about the characteristics of the underlying communication network, I/O
devices, and CPU, and their relative performances. In fact,
communication networks and I/O devices with different performance
characteristics are modeled by different distribution functions. For
example, a distribution function with small standard deviation simulates
a high bandwidth or a lightly loaded system, while a distribution
function with large standard deviation simulates a low bandwidth or a
heavily loaded system.

Besides  communication  delay,  system  and  application  parameters 
(input  parameters)  include  average  transaction  size,  multiprogramming 
level,  database  size,  ratio  of  read-only  transactions  to  update-only 
transactions, and access distribution to the database. Besides blocking
delay, performance measures (output parameters) include probability of
conflict and deadlock among lock requests, average response times of
read-only and update lock requests, and system through-put.

During  the  second  step  of  the  simulation,  the  performance  measures 
obtained  during  the  first  step  will  be  used  to  avoid  simulating  the 
management of locks and timestamps. For example, when a two-phase
locking algorithm is simulated in the second step, the probability
functions of conflict and deadlock obtained during the first step will
be used in conjunction with a random number generator to decide whether
a lock request conflicts or deadlocks. No locks and queues of lock
requests  will  be  simulated  explicitly.  The  second  step  simulation  is 
being  continued  and  will  be  presented  in  a  future  report.  The  first 
step  simulation  model  is  described  in  the  rest  of  this  section. 
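
A minimal Python sketch of how such a draw could work is shown below.
It is not the project's second-step simulator: the function name, the
outcome labels, and in particular the assumption that the deadlock
probability is unconditional and never exceeds the conflict probability
are introduced here for illustration.

```python
import random


def lock_request_outcome(p_conflict, p_deadlock, rng):
    """Classify one lock request as 'granted', 'blocked', or 'deadlock'.

    Assumption: p_deadlock is the unconditional probability that a
    request ends in deadlock, and every deadlock is also a conflict,
    so p_deadlock <= p_conflict.
    """
    if not 0.0 <= p_deadlock <= p_conflict <= 1.0:
        raise ValueError("need 0 <= p_deadlock <= p_conflict <= 1")
    u = rng.random()                # uniform draw in [0, 1)
    if u < p_deadlock:
        return "deadlock"           # request conflicts and closes a cycle
    if u < p_conflict:
        return "blocked"            # request conflicts but no deadlock
    return "granted"                # no conflict at all
```

No lock table or queue is touched; the measured probabilities stand in
for the lock manager, which is what makes the second step cheap.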

For  a  database  size  (DZ)  of  N,  N  locks  and  N  queues  for  the  locks 
are  simulated.  Deadlock  can  occur,  and  the  transaction  in  the  deadlock 
cycle  that  holds  the  least  number  of  locks  aborts  and  restarts  immedi¬ 
ately.  The  reason  we  choose  this  particular  transaction  to  abort  is 
that our previous study (Lin[3]) concludes that this deadlock resolution
algorithm  performs  best  in  all  system  and  application  environments  we 
have  simulated. 
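
The victim selection rule described above can be sketched in Python as
follows. The wait-for graph representation, the function names, and the
tie-breaking by transaction identifier are assumptions of this sketch;
the report states only that the transaction in the cycle holding the
fewest locks is aborted.

```python
def find_cycle(wait_for, start):
    """Return the transactions on a wait-for cycle through `start`, or None.

    `wait_for` maps a blocked transaction to the set of transactions
    holding the lock it is waiting for.
    """
    path, visited = [], {start}

    def dfs(txn):
        path.append(txn)
        for holder in wait_for.get(txn, ()):
            if holder == start:
                return True          # closed the cycle back to `start`
            if holder not in visited:
                visited.add(holder)
                if dfs(holder):
                    return True
        path.pop()                   # dead end: txn is not on the cycle
        return False

    return list(path) if dfs(start) else None


def choose_victim(cycle, locks_held):
    """Pick the transaction in the cycle that holds the fewest locks."""
    return min(cycle, key=lambda txn: (locks_held[txn], txn))
```

For example, with transactions 1, 2, and 3 waiting on each other in a
ring and holding 5, 2, and 7 locks respectively, transaction 2 would be
aborted and restarted.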

Transaction  size  (TZ)  is  assumed  to  be  exponentially  distributed. 
The  average  of  the  distribution  varies  among  different  simulation  runs, 
but  remains  fixed  within  a  simulation  run.  A  transaction  requests  locks 
sequentially,  but  different  transactions  request  locks  asynchronously. 
A  transaction  model  in  which  read  locks  and  write  locks  respectively  are 
requested  in  parallel  is  thus  equivalent  to  our  transaction  model  with 
transaction  size  equal  to  two. 

After  being  granted  a  lock  request,  a  transaction  waits  for  a 
period  of  time  before  requesting  another  lock.  The  period  of  time  is  the 
communication  delay  discussed  previously.  The  average  of  the  communica¬ 
tion  delay  is  fixed  at  one  for  all  simulation  runs,  but  the  standard 
deviation  varies  among  different  simulation  runs.  The  simulation 
results can easily be scaled to whatever the actual average of the
communication delay may be.

The multiprogramming level (MP) is fixed within a simulation run,
but  varies  among  different  simulation  runs;  thus  the  model  is  closed:  a 
new  transaction  is  generated  and  started  only  after  one  completes. 

The  access  distribution  to  the  database  is  random:  every  lockable 
unit has the same probability of being accessed by a lock request. We
use  this  uniform  distribution  because  our  previous  study  (Lin[2])  shows 
that  more  concentrated  distributions  have  the  same  results  as  the  uni¬ 
form  distribution  in  heavier  load. 

Part of the findings concerning the relationship between the R/W
ratio  and  system  performance  of  the  first  step  simulation  is  presented 
in  the  next  section. 


4.3  Simulation  Results 

We ran the simulation program many times with different values of
multiprogramming level, average transaction size, database size,
standard deviation of communication delay, and R/W ratio. The table
below shows the input parameters and their values used in the simulation.

| Input Parameter                            | Values Used             |
|--------------------------------------------|-------------------------|
| transaction size (TZ)                      | 4, 16, 32               |
| multiprogramming level (MP)                | 16, 32, 64              |
| database size (DZ)                         | 4096, 8192              |
| R/W ratio                                  | 1/3, 1/1, 3/1           |
| standard deviation of communication delay  | 0.573, 0.75, 1.87, 5.28 |

Figure 4.0


We present only the results for a standard deviation of communication
delay of 0.75, because we found that the relationship between the R/W
ratio and the performance of two phase locking does not change with
different standard deviations of communication delay (Lin[3]). Tables 1
through 9 show the results, which are also rearranged and plotted in
Figures 4.1 through 4.7.


In Figure 4.1 each point represents the probabilities of conflict
of update lock requests for two system configurations with the same MP,
TZ, DZ, and DV (standard deviation of communication delay), but with
different R/W. Different points of the figure represent different
configurations with different MP, TZ, DZ, or DV.
The  X-coordinate  represents  the  probability  of  conflict  for  the  system 
with  R/W  equal  to  1/3,  and  the  Y-coordinate  the  system  with  R/W  equal 
to  3/1.  The  points  in  the  figure  lie  close  to  the  diagonal  line, 
implying  that  the  probabilities  of  conflict  of  update  requests  are  the 
same  for  each  two  configurations  with  different  R/W  ratio.  The  R/W 
ratio  has  no  effect  on  the  probability  of  conflict  of  update  lock 
requests. 

Figure  4.2  plots  a  similar  graph  for  the  probability  of  deadlock 
of  update  lock  requests,  and  the  points  also  lie  close  to  the  diagonal 
line.  Thus  the  R/W  ratio  also  has  little  effect  on  the  probability  of 
deadlock  of  update  lock  requests. 

These  two  observations  contradict  the  intuition  that  higher  R/W 
ratio  reduces  conflict  and  deadlock  for  update  transactions  because  of 
more share locks and fewer exclusive locks in the system. To gain more
insight  about  this  unexpected  result,  during  each  simulation  run  we 
examined  the  locks  outstanding  in  the  system.  We  found  that  even 
though the ratio of share locks to exclusive locks did increase with
higher  R/W  ratio,  on  the  average  the  total  number  of  locks  outstanding 
in  the  system  at  any  time  varies  little  with  the  R/W  ratio.  And  the 
total  number  of  locks  outstanding  in  the  system  determines  the  proba¬ 
bility  of  conflict  and  the  probability  of  deadlock  of  update  requests, 
because  update  requests  conflict  and  deadlock  with  both  read  and  update 
lock  requests. 
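
The argument can be made concrete with a back-of-the-envelope estimate.
The Python sketch below assumes uniform access and that every
outstanding lock sits on a distinct granule; both are simplifications
introduced here, not measurements from the study. The function name and
the example lock counts are hypothetical.

```python
def conflict_probability(request_kind, share_locks, exclusive_locks, db_size):
    """Rough chance that a new request hits an incompatible lock.

    An update (exclusive) request conflicts with any outstanding lock;
    a read (share) request conflicts only with exclusive locks.
    """
    if request_kind == "update":
        blocking = share_locks + exclusive_locks
    elif request_kind == "read":
        blocking = exclusive_locks
    else:
        raise ValueError("request_kind must be 'update' or 'read'")
    return blocking / db_size


# Same total of 200 outstanding locks, two different share/exclusive mixes.
for share, exclusive in [(50, 150), (150, 50)]:
    p_update = conflict_probability("update", share, exclusive, 4096)
    p_read = conflict_probability("read", share, exclusive, 4096)
    print(f"share={share:3d} exclusive={exclusive:3d} "
          f"update={p_update:.4f} read={p_read:.4f}")
```

Under these assumptions the estimate for update requests depends only
on the total number of locks, so shifting the mix toward share locks
leaves it unchanged, while the estimate for read requests falls with
the number of exclusive locks.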

Figure  4.3  plots  a  similar  graph  for  the  average  blocking  delay  of 
blocked  update  lock  requests.  The  figure  shows  that  the  R/W  ratio  does 
have  an  effect,  though  only  a  small  one,  on  the  blocking  delay  of 
blocked update lock requests. Specifically, the average waiting time
decreases a little when the R/W ratio increases from 1/3 to 3/1. Our
results (Table 8) show that with a higher R/W, read-only transactions
complete slightly faster; thus update transactions wait slightly less
time for blocking read-only transactions.


We have examined the effect of the R/W ratio on the probability of
conflict  and  deadlock  of  update  lock  requests.  Here,  we  examine  how 
that  effect  translates  into  system  performance  in  terms  of  response 
time.  Table  4  shows  the  average  response  time  of  completed  update  lock 
requests.  The  average  response  time  includes  time  wasted  due  to  tran¬ 
saction  abortion.  Notice  that  if  there  is  no  blocking  delay  and  tran¬ 
saction  abortion,  then  the  average  response  time  of  update  lock 
requests  must  be  one.  The  table  shows  that  the  only  acceptable 
response  times  occur  when  the  average  transaction  size  is  four  or  the 
load is less than 1%. The load is defined as the product of the
multiprogramming level and the average transaction size divided by the
database size. The table shows that within the acceptable range of
transaction
size  and  system  load,  the  average  response  time  of  update  lock  requests 
varies  little  with  the  R/W  ratio.  This  is  expected  because  of  the 
invariance  of  the  probability  of  conflict,  probability  of  deadlock,  and 
blocking  delay  of  update  requests  with  respect  to  R/W  ratio. 
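
The load definition above is easy to evaluate for the parameter values
listed in Figure 4.0. The following Python sketch does so; the function
and constant names are chosen here for illustration.

```python
from itertools import product

MP_VALUES = (16, 32, 64)      # multiprogramming levels from Figure 4.0
TZ_VALUES = (4, 16, 32)       # average transaction sizes
DZ_VALUES = (4096, 8192)      # database sizes


def system_load(mp, tz, dz):
    """Load = multiprogramming level * average transaction size / database size."""
    return mp * tz / dz


def light_configurations(threshold=0.01):
    """Configurations whose load is below the threshold (1% by default)."""
    return [(mp, tz, dz)
            for mp, tz, dz in product(MP_VALUES, TZ_VALUES, DZ_VALUES)
            if system_load(mp, tz, dz) < threshold]
```

With these values only MP = 16, TZ = 4, DZ = 8192 falls below 1% (a load
of about 0.78%); the heaviest configuration, MP = 64, TZ = 32,
DZ = 4096, has a load of 50%.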

We  next  examine  the  effect  of  the  R/W  ratio  on  read-only  lock 
requests.  Similar  to  Figures  4.1  and  4.2,  Figure  4.4  and  Figure  4.5 
plot  the  probability  of  conflict  and  the  probability  of  deadlock  of 
read-only  lock  requests  respectively.  These  figures  show  that  higher 
R/W  ratios  reduce  significantly  both  the  probability  of  conflict  and 
the  probability  of  deadlock  of  read-only  lock  requests. 

Similar  to  Figure  4.3,  Figure  4.6  plots  the  average  blocking  delay 
of  blocked  read-only  lock  requests,  and  the  figure  shows  that  the  R/W 
ratio  has  little  effect  on  the  blocking  delay  of  read-only  requests. 
This  occurs  because  blocked  read  requests  wait  only  for  update  transac¬ 
tions,  and  we  have  shown  previously  that  the  R/W  ratio  has  little 
effect  on  the  response  time  of  update  transactions. 

We  have  shown  that  a  higher  R/W  ratio  reduces  the  probability  of 
conflict  and  deadlock  of  read-only  lock  requests,  but  it  has  no  effect 
on  their  blocking  delay.  What  does  this  mean  in  terms  of  the  average 
response  time  of  read-only  lock  requests?  Table  8  shows  the  average 
response time of read-only requests, indicating that acceptable
response times occur when the average transaction size is 4 or the
system load is less than 1%. Within this range of transaction size and
system load, when the R/W ratio increases from 1/3 to 3/1, the average
response time of read-only requests decreases only slightly.


This result is surprising because we have previously observed that
the  probability  of  conflict  and  deadlock  of  read-only  requests 
decreases  significantly  when  the  R/W  ratio  increases  from  1/3  to  3/1. 
But  because  the  probability  of  conflict  and  deadlock  is  very  small  to 
begin  with  when  the  transaction  size  and  system  load  are  within  accept¬ 
able  range,  the  reduction  in  the  probability  of  conflict  and  deadlock 
does  not  improve  significantly  the  response  time  of  read-only  requests. 

We next examine the relationship between the R/W ratio and system
through-put. The results are shown in Table 9. The table shows that,
within the acceptable range of transaction size and system load, the
system through-put does not increase significantly when the R/W ratio
increases from 1/3 to 3/1.

To  gain  more  insight,  we  did  some  time  series  analysis  and  found 
that  regardless  of  the  R/W  ratio  in  the  incoming  transaction  stream, 
the  system  is  eventually  saturated  with  mostly  update  transactions. 
Figure  4.7  shows  the  time  series  of  numbers  of  update  transactions 
active  in  a  system  with  R/W  ratio  equal  to  3/1  and  multiprogramming 
level equal to 64. The figure shows that the number of update
transactions active in the system stabilizes at about 95% of the
multiprogramming level, even though 75% of incoming transactions are
read-only
transactions.  This  explains  why  the  system  does  not  perform  much 
better  when  the  R/W  ratio  increases  from  1/3  to  3/1,  because  the  system 
is  clogged  up  with  update  transactions  that  complete  slowly. 
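
One way to see how this can happen: in a closed system each finished
transaction is replaced by a new one, so the long-run share of active
slots occupied by updates depends on how long each kind of transaction
stays in the system, not only on the arrival mix. The Python sketch
below computes that share under the simplifying assumption that
read-only and update transactions each have a fixed average residence
time. The function and parameter names, and the residence-time ratio in
the note that follows, are illustrative assumptions rather than
measurements from the study.

```python
def active_update_fraction(p_update, t_update, t_read):
    """Long-run share of active transactions that are updates.

    p_update -- probability that an incoming transaction is an update
    t_update -- average time an update transaction stays in the system
    t_read   -- average time a read-only transaction stays in the system
    """
    update_time = p_update * t_update
    read_time = (1.0 - p_update) * t_read
    return update_time / (update_time + read_time)
```

For instance, with 25% of incoming transactions being updates, the
active share reaches about 0.95 only if an update stays roughly 57 times
as long as a read-only transaction, since
active_update_fraction(0.25, 57.0, 1.0) evaluates to approximately 0.95.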

You  might  have  noticed  that  the  database  sizes  used  are  relatively 
small  compared  to  actual  databases.  But  we  have  pointed  out  that  the 
results  apply  to  systems  with  short  transactions  and  moderate  loads; 
therefore  the  results  apply  to  systems  with  larger  database  size, 
because  the  larger  the  DZ,  the  lighter  the  load. 


4.4  Summary 

We simulated two phase locking in various system and application
environments. We found that R/W has little or no effect on the
probability of conflict and deadlock and the blocking delay of update
lock
requests.  In  addition,  the  R/W  ratio  has  little  effect  on  the  response 
times  of  update  lock  requests. 

We  also  found  the  R/W  ratio  has  little  effect  on  the  blocking 
delay  of  read-only  transactions.  However,  we  found  that  the  R/W  ratio 
has  significant  effect  on  the  probabilities  of  conflict  and  deadlock  of 
read-only transactions. An increase in the R/W ratio significantly
reduces the probabilities of conflict and deadlock of read-only
transactions. But if the average transaction size is small or the system
load is light, then this reduction in the probability of conflict and
deadlock reduces only slightly the response time of read-only lock
requests. And the overall system through-put is little affected by the
R/W ratio.


Basic Timestamp, Multiple Version Timestamp,
and  Two  Phase  Locking 


Abstract 


Using simulation, we compare the performance of the basic timestamp,
the multiple-version timestamp, and the two phase locking concurrency
control protocols. We find that in every system configuration we have
simulated the multiple version timestamp protocol performs only
marginally better than the basic timestamp protocol. In addition, we
find that when the average transaction size is small, both timestamp
protocols outperform the two phase locking protocol. But when the
average transaction size is large, the two phase locking protocol
outperforms both timestamp protocols.


5.1  Introduction 


Many distributed concurrency control algorithms have been proposed
(Bad[1], Ber[1], Ell[1], Gar[1], Lin[3], Ros[1], Ste[1], Sto[1],
Tho[1]). But how well do they perform?

A few researchers have attempted to compare the performance of
different algorithms (Gar[1], Lin[1], LN[1], LN[2], LN[3], Mun[1],
Rie[1], Tha[1]), and only one of them studied the performance of
timestamp protocols (Lin[1]).

These  performance  studies  are  very  difficult  to  compare,  and  it  is 
almost  impossible  to  integrate  their  results.  They  compare  different 
algorithms, and they make different assumptions about system and
application environments and employ different measures for system
performance. Therefore we began a major project that compared the
principal distinct distributed concurrency control and reliability
algorithms, using the same model, assumptions, performance (output)
parameters, and system and application (input) parameters. Some results
of the project concerning two phase locking have been reported in
Lin[2], LN[1], LN[2], and LN[3]. This paper reports some of the results
of this project that concern timestamp protocols.

In  particular,  this  paper  reports  our  findings  about  the  perfor¬ 
mance  of  the  basic  timestamp  and  the  multiple  version  timestamp  proto¬ 
cols  (Ber[l]),  and  about  the  comparison  of  their  performance  to  the  per¬ 
formance  of  the  two  phase  locking  protocol.  We  found  that,  contrary  to 
our  intuition,  the  multiple  version  timestamp  protocol  did  not  signifi¬ 
cantly  increase  the  throughput  of  read-only  transactions  over  the  basic 
timestamp  protocol;  neither  did  it  improve  the  throughput  of  update 
transactions.  We  also  found  that  both  timestamp  protocols  performed 
much better than the two phase locking protocol when the average
transaction size was small. But when the average transaction size was
large, the two phase locking protocol outperformed both timestamp
protocols.


This paper is organized as follows. Section 5.2 describes the
overall simulation model. Section 5.3 describes the specifics of the
basic timestamp and the multiple version timestamp models. In addition,
the simulation results for these two models are discussed. Section 5.4
discusses the simulation results of a modified model in which the ratio
of read-only transactions to update transactions is fixed inside the
system, instead of in the incoming transaction stream. Section 5.5
compares two phase locking with the basic timestamp protocol, and
Section 5.6 concludes the study. Section 5.7 contains the references.


5.2  The  Simulation  Model


Performance of a concurrency control algorithm in a DBMS depends on
many aspects of the entire system. These include system characteristics
such as multiprogramming level (number of transactions running
concurrently), database size (number of data granules) and granularity
with which data can be locked or accessed, communication network, I/O
devices, memory size, CPU speed, number of nodes in the system, and
distribution of the data among these nodes. Performance is also affected
by the
nature  of  the  application  —  the  transactions  executed  to  read  or  update 
the  database.  Transaction  characteristics  include  transaction  size 
(number  of  data  granules  requested  by  each  transaction),  the  frequency 
of  local  and  remote  requests  for  data,  access  distribution  of  the  data¬ 
base  (probability  of  each  data  granule  being  accessed  by  a  data 
request),  and  whether  the  transactions  only  read  data  or  update  the 
database.  Thus,  to  accurately  evaluate  the  performance  of  a  concurrency 
control  algorithm,  we  must  include  all  these  factors  in  the  simulation 
model,  and  this  is  too  expensive  to  do  directly. 

To  simplify  the  simulation,  we  model  the  system  and  transactions  in 
a  highly  functional  model.  Much  of  the  detail  of  a  real  distributed 
system  is  captured  in  a  few  parameters  that  are  used  as  inputs  to  the 
simulation  model.  This  approach  permits  us  to  greatly  reduce  the  number 
of  simulation  runs  necessary  and  the  complexity  of  the  model,  while 
retaining  most  of  the  impact  that  these  details  have  on  the  performance 
of  the  concurrency  control  algorithms. 

We model a transaction as a sequence of data requests, each
requesting a data granule. The size of the data granule is irrelevant
in our model. Between two consecutive data requests, a transaction
incurs a delay called processing delay. The processing delay consists
of communication network delay, I/O delay, and CPU processing delay.
Communication network, I/O devices, and CPUs are not simulated in detail
in our model. Instead we use the processing delay as an input parameter
to our simulation model. For each simulation run, we assume the
processing delay to have a certain probability distribution, but we vary
the probability distribution for different simulation runs. We use only
hypoexponential and hyperexponential distributions; therefore each
distribution can be characterized by its average and standard deviation.


Since the processing delay (consisting of communication network
delay, I/O delay, and CPU delay) is an input parameter and is modeled by
an abstract probability distribution function, we make no assumptions
about the characteristics of the underlying communication network, I/O
devices, and CPUs, and their relative performances. In fact,
communication networks and I/O devices that have different performance
characteristics are modeled by different distribution functions. For
example,
a  distribution  function  that  has  small  standard  deviation  models  a  high 
bandwidth  or  a  lightly  loaded  system,  while  a  distribution  function  that 
has  large  standard  deviation  models  a  low  bandwidth  or  a  heavily  loaded 
system. 

Besides  processing  delay,  input  parameters  of  the  simulation  model 
include  average  transaction  size  (TZ),  multiprogramming  level  (MP), 
database size (DZ), ratio of read-only transactions to update-only
transactions (R/W) entering the system, and access distribution to the
database.


We  did  not  explicitly  include  the  frequency  of  local  and  remote 
data  requests  as  an  input  parameter  because  it  is  captured  by  the  proba¬ 
bility  distribution  of  the  processing  delay.  For  example,  a  system  that 
executes  mostly  local  data  requests  can  be  modelled  by  a  distribution 
that  has  a  small  mean  value.  Neither  did  we  include  the  data  granular¬ 
ity  as  an  input  parameter,  because  the  data  granularity  is  a  function  of 
the  database  size  and  the  transaction  size.  Increasing  the  granularity 
is  equivalent  to  decreasing  the  database  size  and  the  transaction  size. 
Moreover,  we  simulated  only  random  access  to  the  database.  Every  data 
granule  has  the  same  probability  of  being  accessed  by  a  data  request. 
We used this uniform distribution because our previous study (LN[1])
showed  that  more  concentrated  distributions  had  the  same  results  as  the 
uniform  distribution  in  heavier  load.  The  details  of  these  input  param¬ 
eters  follow. 

We  simulated  two  kinds  of  transactions,  read-only  transactions  and 
update transactions, and the R/W ratio determines their ratio in the
incoming  transaction  stream  fed  to  the  simulation  system. 

Transaction  size  is  assumed  to  be  exponentially  distributed  with 
mean TZ. The mean TZ varies among different simulation runs, but
remains fixed within a simulation run. We model a read-only transaction
as  a  sequence  of  read  requests,  and  an  update  transaction  as  a  sequence 
of  read  requests  (with  intention  to  update  later),  followed  by  parallel 
update  requests  (each  requesting  only  one  data  granule).  Thus  in  an 
update  request,  we  require  that  each  data  granule  be  read  before  it  is 
updated.  We  assume  that  an  update  transaction  has  two  phases:  a  read 
phase  and  a  write  phase.  During  the  read  phase,  an  update  transaction 
issues  a  sequence  of  read  requests,  and  during  the  write  phase  all 
updates are committed in parallel into the database. This model is
general  enough  to  include  all  transaction  types  of  interest.  For  exam¬ 
ple,  to  model  a  pure  update  transaction  (one  that  does  not  read  from  the 
database)  there  will  be  only  the  write  phase.  To  model  transactions  in 
which  read  requests  are  issued  in  parallel  only  once,  the  read  phase 
will  issue  only  one  read  request. 
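
The transaction model just described can be sketched as a small
generator for the incoming stream. This is only an illustration: the
class and function names are hypothetical, and the sketch assumes that
the exponentially distributed size is rounded up to a whole number of
granules and that an update transaction writes every granule it has
read, neither of which is stated in the report.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Transaction:
    kind: str          # "read-only" or "update"
    reads: int         # sequential read requests in the read phase
    writes: int        # parallel update requests in the write phase


def read_only_probability(r, w):
    """Share of read-only transactions in a stream with R/W ratio r/w."""
    return r / (r + w)


def next_transaction(r, w, mean_size, rng):
    """Draw one incoming transaction.

    Assumptions of this sketch: the exponential size is rounded up to a
    whole number of granules, and an update transaction writes every
    granule it has read.
    """
    size = max(1, math.ceil(rng.expovariate(1.0 / mean_size)))
    if rng.random() < read_only_probability(r, w):
        return Transaction("read-only", reads=size, writes=0)
    return Transaction("update", reads=size, writes=size)
```

An R/W ratio of 3/1 gives a read-only probability of 0.75, and a ratio
of 1/3 gives 0.25. Note that rounding the size up shifts the realized
mean slightly above TZ.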

After  being  granted  a  data  granule,  a  transaction  waits  for  a 
period  of  time  before  requesting  another  granule.  This  period  of  time 
is  the  processing  delay  discussed  previously.  The  average  of  the  pro¬ 
cessing  delay  is  fixed  at  one  for  all  simulation  runs,  but  the  standard 
deviation  varies  among  different  simulation  runs.  The  simulation 
results  can  easily  be  scaled  to  whatever  the  actual  average  of  the  pro¬ 
cessing  delay  may  be. 

The multiprogramming level (MP) is fixed within a simulation run,
but  it  varies  among  different  simulation  runs;  thus  the  model  is  closed 
and  a  new  transaction  is  generated  and  started  only  after  one  completes. 

Performance  measures  (output  parameters)  of  the  simulation  model 
include  probability  of  restart,  system  throughput,  and  others  that  will 
be  mentioned  when  specific  models  are  discussed. 

Some of the results on the performance of the basic timestamp and
multiple version timestamp protocols are presented in the following
section.


5.3  Basic  Timestamp  vs  Multiple-Version  Timestamp 

In  this  section,  we  first  describe  the  specifics  of  the  basic 
timestamp and the multiple version timestamp models, and then we discuss
the simulation results. Much of the detail of the protocol can also be
found in Ber[1]. We describe the basic timestamp model first.

We assign a unique timestamp (drawn from the system clock) to each
transaction when we initiate the transaction. We keep a read timestamp
and a write timestamp with each data granule of the database. The read
timestamp and the write timestamp record the timestamps of the last
transactions to read and to write the data granule, respectively.

To synchronize an update transaction, during the read phase the
timestamp of the update transaction is compared with the read and write
timestamps of each data granule read. If the timestamp of the update
transaction is smaller than the read timestamp, the update transaction
is restarted immediately to avoid aborting it later when it tries to
commit (since we never abort read-only transactions). If the timestamp
of the update transaction is smaller than the write timestamp, then the
update transaction is also restarted, because it is trying to read the
data granule after a transaction with a greater timestamp has updated
it. If the timestamp of the update transaction is larger than both the
read and write timestamps of the data granule, then it replaces the
read timestamp of the data granule and the update transaction continues.
During the write phase, the timestamp of the update transaction is again
compared to the read and write timestamps of each data granule updated.
If the timestamp of the update transaction is smaller than the read
timestamp, the update transaction is again restarted. But if the
timestamp of the update transaction is smaller than the write timestamp,
the write operation is ignored. If the timestamp of the update
transaction is larger than both the read and write timestamps of the
data granule, the write timestamp of the data granule is replaced by the
timestamp of the update transaction. If T(t) represents the timestamp of
transaction t, and R(x) and W(x) the read timestamp and write timestamp
of data granule x, the protocol can be summarized as follows. During the
read phase of an update transaction t,

    for each x read by t,
        if T(t)<R(x) --> restart t;
        if T(t)<W(x) --> restart t;
        if T(t)>R(x) & T(t)>W(x) --> replace R(x) by T(t), read proceeds.

And during the write phase of an update transaction t,

    if T(t)<R(x) for any x updated by t --> restart t;
    else for each x updated by t,
        if T(t)<W(x) --> update to x is ignored,
        if T(t)>W(x) --> replace W(x) by T(t), commit the update.
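The summary above can be sketched in Python as follows. This is a
minimal illustration and not the simulator used in the study: the names
Granule, read_phase, and write_phase are hypothetical, timestamps are
assumed to be unique integers, and each granule is assumed to start with
both timestamps at zero.

```python
from dataclasses import dataclass


@dataclass
class Granule:
    """Timestamps kept with one data granule x (hypothetical structure)."""
    r: int = 0  # R(x): timestamp of the last transaction that read x
    w: int = 0  # W(x): timestamp of the last transaction that wrote x


def read_phase(t, granules):
    """Read phase of an update transaction with timestamp t.

    Granules are checked one at a time, as requests are processed
    sequentially; R(x) of granules read before a restart stays updated.
    """
    for x in granules:
        if t < x.r or t < x.w:
            return "restart"
        x.r = t  # T(t) is not smaller than R(x) or W(x): replace R(x)
    return "proceed"


def write_phase(t, granules):
    """Write phase of an update transaction with timestamp t."""
    if any(t < x.r for x in granules):
        return "restart"
    for x in granules:
        if t > x.w:
            x.w = t  # replace W(x) by T(t); the update is committed
        # otherwise T(t) < W(x) and the update to x is ignored
    return "commit"
```

Because an update transaction reads what it writes, R(x) equals T(t)
when the write phase begins unless a younger transaction has read x in
the meantime, which is exactly the case that forces a restart.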

To process a read request from a read-only transaction, we compare
its timestamp with the write timestamp of the data granule. If the
write timestamp of the data granule is larger, then the read-only
transaction is restarted; otherwise the read-only transaction continues,
and the read timestamp of the data granule is replaced by the timestamp
of the read request if the latter timestamp is greater than the former.
In summary,

    T<W --> restart

    T>W --> read-only proceeds, and replace R by T if R<T.
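The read-only rule can be written as a single function. The sketch
below is illustrative only; the function name and the tuple it returns
are assumptions made for this example, with t, r, and w standing for T,
R, and W in the summary above.

```python
def read_only_request(t, r, w):
    """Apply the read-only rule to one request.

    t is the timestamp of the read-only transaction; r and w are the
    read and write timestamps of the data granule. Returns the outcome
    and the read timestamp the granule holds afterwards.
    """
    if t < w:
        return "restart", r
    # The read proceeds; R is replaced by T only if R < T.
    return "proceed", max(r, t)
```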

Performance  measures  of  the  model  include  system  throughput  (number 
of  requests  completed  per  time  unit)  and  the  probability  of  restart  for 
both  read  requests  of  read-only  transactions  and  read  requests  of  update 
transactions  during  the  read  phase.  Since  in  the  case  of  timestamping 
protocols,  an  update  transaction  may  progress  to  the  write  phase  and 
then  conflict  and  abort,  we  also  include  the  probability  of  restart  of 
transactions  (not  data  requests)  during  the  write  phase. 

The multiple version timestamp model is very similar to the basic
timestamp model. System and application parameters, and conflicts
between data requests and data timestamps, were dealt with in the same
way. In the multiple version model, however, we kept four read and four
write timestamps for each data granule; the first one is the smallest
and the fourth one the largest. Because we did not simulate computation
within each transaction, we did not keep the data values corresponding
to the four write timestamps for each data granule. A read-only
transaction can access earlier versions of the data if the timestamp of
the read-only transaction is smaller than the largest write timestamp
of the data granule to be accessed. But because we require an update
transaction to read first what it writes, an update transaction can
only read the latest version; if the R/W ratio is zero, this model
degenerates to the basic timestamp model. For this reason, we did not
simulate any system configuration with R/W equal to 0.
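As an illustration of how a read-only transaction might pick a version,
the sketch below selects the newest version whose write timestamp does
not exceed the transaction timestamp. The selection rule, the function
names, and the handling of a transaction older than every kept version
are assumptions made for this example; the text above states only that
earlier versions may be read.

```python
from bisect import bisect_right


def version_for_read_only(t, write_ts):
    """Index of the version read by a read-only transaction.

    write_ts is the ascending list of write timestamps kept for one
    data granule (four in the model described above). Returns None when
    t is older than every kept version, a case the text leaves open.
    """
    i = bisect_right(write_ts, t)
    return i - 1 if i > 0 else None


def update_may_read(t, write_ts):
    """An update transaction may read only the latest version."""
    return t > write_ts[-1]
```

Only the write timestamps are examined here; an update transaction is
still subject to the read timestamp checks of the basic protocol.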

Since the probability of restart for read-only transactions is
already small in the single version basic timestamping protocol, and
since the number of versions does not affect the probability of restart
for update transactions, we decided not to vary the number of versions.
We compare the simulation results of both the basic and the
multiple-version timestamp protocols in the following.

We first examine the probability that a read request (either a request
from a read-only transaction or a request from an update transaction
during the read phase) will conflict, resulting in the restart of its
transaction. Figure 5.1 and Figure 5.2 respectively show these
probabilities for the basic and the multiple-version timestamp
protocols. We note that because read-only transactions never restarted
in the multiple-version timestamp model, Figure 5.2 contains only data
for update transactions during the read phase. We note also that, for
some of the heavy load cases, the system thrashed and never stabilized;
therefore the data for those cases are not reliable. However, they do
qualitatively indicate what is happening. Comparing these two figures,
we find very little difference between the basic timestamp and the
multiple-version timestamp protocols in the probability of restart
during the read phase.

We next examine the probability of restart of update transactions
during the write phase. Figure 5.3 and Figure 5.4 show the results for
the basic timestamp and the multiple-version timestamp protocols, and
the difference between the two figures is very small.

For the basic and the multiple-version timestamp protocols, Figure
5.5 and Figure 5.6 show the system throughput, which is the number of
completed (excluding those aborted) data requests per time unit. Notice
that the average processing delay is always one and that there are
always MP transactions running in the system; therefore if there is no
transaction abortion, the throughput must equal MP, which is the maximum
possible throughput. Combined read-only and update throughputs for
system configurations that have an average transaction size equal to 4
are within 10% of the maximum possible. But combined throughputs of
system configurations that have an average transaction size (TZ) larger
than 16 are less than 30% of the maximal throughput.
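The bound used in this comparison follows from the closed model: with
MP transactions always running and a mean processing delay of one, at
most MP requests complete per time unit. A small sketch of the
arithmetic is given below; the function names and the sample numbers in
the usage note are hypothetical and are not taken from the figures.

```python
def max_throughput(mp, mean_delay=1.0):
    """Upper bound on completed requests per time unit in a closed model."""
    return mp / mean_delay


def fraction_of_max(measured, mp, mean_delay=1.0):
    """Measured throughput as a fraction of the upper bound."""
    return measured / max_throughput(mp, mean_delay)
```

For example, a measured throughput of 4.0 requests per time unit at
MP = 16 would be one quarter of the maximum.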

These two figures show system thrashing when the average transaction
size is large or the system load is heavy. If the system is in
equilibrium, write throughput should be very nearly 1/3 of the read
throughput, since incoming transactions occur in that ratio. However,
this is not true for TZ=32, or for TZ=16 with MP=32 and 64. In these
cases, the system thrashed and was jammed with long update transactions
that never finished. These observations show that both timestamp
protocols perform extremely poorly with long transactions or under
heavy loads.

When we compare Figure 5.5 with Figure 5.6, we find little difference
between these two protocols in throughput except when the transaction
size (TZ) or the system load (TZ x MP/DZ) is large, in which case the
throughputs are extremely low and the statistics are not reliable
anyway.

From the observations of this section, we can conclude that both
protocols perform poorly when the average transaction size is large or
when the system load is very heavy. In addition, there is no significant
difference in performance between the basic timestamp and the multiple
version timestamp protocols. More versions of data do not significantly
improve the throughput of read-only transactions. When the load is
light, the probability of conflict for read-only transactions is very
small; therefore more versions of data do not increase the read-only
transaction throughput. When the load is heavy, the system is jammed
with long update transactions that never finish, thus locking out
read-only transactions; therefore more versions of data do not help
either.

One  may  argue  that  if  we  do  not  allow  the  system  to  be  saturated 
with long update transactions, then the multiple-version timestamp
protocol should perform better than the basic timestamp protocol. We will
test  this  argument  in  the  next  section. 


5.4  Results  of  a  Modified  Model 


In  the  last  section,  we  concluded  that  there  is  no  significant 
difference between basic timestamp and multiple-version timestamp
protocols in performance, including the throughput of read-only
transactions.
One  may  argue  that  this  conclusion  is  not  valid  because  the  simulation 
model  should  not  have  allowed  update  transactions  to  jam  the  system, 
thus  locking  out  read-only  transactions. 

To  test  this  argument,  we  impose  the  R/W  ratio  limitation  inside 
the  system,  instead  of  in  the  incoming  transaction  stream:  that  is,  the 
ratio  of  the  number  of  running  read-only  transactions  to  the  number  of 
running update transactions is always fixed at R/W. All other parameters
of the model remain unchanged. The results are shown in Figures
5.7  and  5.8  for  the  basic  and  multiple-version  timestamp  protocols 
respectively.  We  include  in  the  figures  data  from  the  previous  model 
for  comparison.  These  data  are  marked  by  *. 

Comparing the data of the modified model to the data of the previous
model, we find that by fixing the R/W ratio inside the system, instead
of in the incoming transaction stream, the throughputs of read-only
transactions increase tremendously when the average transaction size
(TZ) is large. The reason is that when the R/W ratio is fixed inside
the system, the system can no longer be saturated with long update
transactions that never finish. But when the average transaction size
is small, fixing the R/W ratio inside the system does not significantly
increase the throughputs of read-only transactions. The reason is that
the system is never saturated with long update transactions in the
first place.
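One way to hold the running ratio at R/W in a closed model is to decide,
each time a slot frees up, which kind of transaction to start next. The
sketch below is an assumption about how such an admission rule could
look; the text does not describe the mechanism used in the simulation,
and the function name and parameters are hypothetical.

```python
def next_transaction_type(running_read_only, mp, r, w):
    """Choose the type of the next transaction to start.

    running_read_only is the number of read-only transactions still
    running after one transaction has completed; mp is the
    multiprogramming level; r and w give the target ratio R/W.
    """
    # Number of slots reserved for read-only transactions.
    target_read_only = round(mp * r / (r + w))
    if running_read_only < target_read_only:
        return "read-only"
    return "update"
```

With MP = 8 and R/W = 3, six slots are held for read-only transactions,
so long update transactions can occupy at most two slots at a time.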

When we compare Figure 5.7 with Figure 5.8, we find no significant
difference between the performance of the basic timestamp protocol and
the multiple-version timestamp protocol. This contradicts the earlier
argument that if the R/W ratio is fixed inside the system instead of in
the incoming transaction stream, the multiple-version timestamp protocol
should have higher read-only transaction throughputs than the basic
timestamp protocol.


The reason for this surprising result is that both timestamp protocols
favor read-only transactions. Whenever there is a conflict between
an active read-only transaction and an active update transaction, both
protocols abort the update transaction. In both protocols, an active
read-only transaction is aborted only if it conflicts with a completed
update transaction that has a later timestamp, and this occurs rarely
because update transactions take much longer to complete. Since
read-only transactions rarely get aborted in the basic timestamp
protocol, more versions of data make little difference in read-only
transaction throughput.


5.5  Timestamp  vs  Locking


In this section we compare the performance of the basic timestamp
protocol with the performance of the two phase locking protocol.

The simulation model for two phase locking (LN[1], LN[2], LN[3]) is
similar to the timestamp model except that two phase locking is
substituted for the basic timestamp protocol. We show part of the
simulation results, specifically the throughput, in Figure 5.9. The
unit of the throughputs is the number of data requests completed per
time unit, excluding requests aborted.

Comparing Figure 5.9 with Figure 5.5, we find that when the average
transaction size (TZ) is small, the basic timestamp protocol outperforms
the two phase locking protocol. But when the average transaction size
is relatively large (TZ larger than 16), the two phase locking protocol
outperforms the basic timestamp protocol.

To learn why the timestamp protocol outperforms two phase locking
when the average transaction size is small, we examined our previous
simulation results on the two phase locking protocol ([Lin2], [NL1],
[NL2]). We found that, in the two phase locking protocol, blocked
transactions tend to wait for long transactions, even when the average
transaction size is small. Since long transactions take long periods of
time to complete, blocked transactions tend to wait for long periods of
time. On the other hand, Figure 5.1 shows that when the average
transaction size is small, the probability of the basic timestamp
protocol restarting a transaction is very small. Therefore we conclude
that when the average transaction size is small, restarting transactions
in the basic timestamp protocol is better than blocking transactions in
the two phase locking protocol. But the reverse is true when the average
transaction size is large, because in the timestamp method thrashing is
a serious problem: many transactions are constantly aborted and never
finish.


We must caution that this result must be taken in the context of
our simulation model assumptions. In our model, we do not simulate
queueing for CPU, I/O devices, and communication lines. Queueing for
these devices is captured in a single model parameter, the processing
delay, which has an Erlangian distribution. To validate the conclusions


5.6  Conclusions 


We come to three major conclusions concerning the performance of
the timestamp concurrency control method.

First, over a wide range of system conditions, the multiple version
timestamp method performs only marginally better than the basic
timestamp method. When the average transaction size (TZ) is small,
read-only transactions complete quickly and rarely conflict with younger
update transactions that have completed; therefore more versions of data
help only marginally. When the average transaction size is large, the
system is jammed with update transactions, and few new read-only
transactions can start; thus more versions of data do not improve the
throughput of read-only transactions either. When we fixed the R/W ratio
inside the system to prevent the system from being saturated with update
transactions, the multiple version timestamp protocol still performed
only marginally better than the basic timestamp protocol, because
read-only transactions complete quickly and rarely conflict with younger
transactions that have completed.

The second conclusion is that when the average transaction size
(TZ) is small, the basic timestamp protocol outperforms the two phase
locking protocol. But when the average transaction size is relatively
large, the two phase locking protocol outperforms the basic timestamp
protocol.

The third conclusion is that when the average transaction size is
small, fixing the ratio of read-only transactions to update transactions
inside the system does not improve system performance. But when the
average transaction size is relatively large, fixing the R/W ratio
inside the system significantly improves the throughput of the system,
because this prevents the system from being saturated by long update
transactions. This amounts to giving read-only transactions higher
priority to enter the system; since read-only transactions complete
faster, they also enter faster.

But we caution that these conclusions must be taken in the context of
the simulation model assumptions. Currently we are altering some of the
assumptions to see whether these conclusions remain true, and
preliminary results seem to indicate that they do.
SECTION  VI

Performance  of  Distributed  Concurrency  Control* 

Wente  K.  Lin 
Jerry  Nolte 


*  A  version  of  this  paper  appeared  in  the  Distributed  Database  System 
Designers Handbook, prepared for the DDB Control and Allocation
Project.


6.  Performance  of  Distributed  Concurrency  Control 

6.1  Introduction 

Many factors affect the performance of a distributed concurrency
control algorithm:

1.  I/O  delay,

2.  communication  delay, 

3.  ratio  of  read-only  to  write  transactions, 

4.  database  size,  transaction  size, 

5.  system  multiprogramming  level, 

6.  distribution  and  replication  of  the  database, 

7.  overhead  of  deadlock  detection, 

8.  and system load, defined as the product of transaction size and
multiprogramming level divided by the database size.

Our simulation study of the performance of distributed concurrency
control algorithms shows that four of these factors have more
significant impact than the others: I/O delay, communication delay,
transaction size, and system load. Hence we divide our simulation
results into groups and discuss them separately by classifying the
system environment as either I/O bound or communication bound, and as
either short transaction loaded or long transaction loaded. We consider
a system to be I/O bound if queueing for I/O or CPU resources is a more
significant problem than queueing for communication channels; and we
consider a system to be communication bound if queueing for
communication channels is a more significant problem than queueing for
I/O and CPU resources. We consider a system to be short transaction
loaded if the average number of data items requested by the transactions
(the transaction size) is less than 0.05% of the database. The system is
long transaction loaded if the average is larger than 0.2% of the
database. If the average is between 0.05% and 0.2% of the database, the
classification of the system as short transaction loaded or long
transaction loaded depends on the system load. Details of the
classification can be found in Figure 6.1.

Figure 6.1  System Classification (Short Loaded or Long Loaded)

Thus we present four categories of system environments: short
transaction loaded and I/O bound (SIO), short transaction loaded and
communication bound (SCM), long transaction loaded and I/O bound (LIO),
and long transaction loaded and communication bound (LCM). For each of
these four environments, we compare the performance of various
concurrency control algorithms, taking into consideration the factors
that are not used to classify the system environment, i.e.
multiprogramming level, ratio of read-only to write transactions, and
distribution and replication of the database.

We first describe, in Section 6.2, the distributed DBMS model that
we use to evaluate these algorithms. We then define and describe, in
Section 6.3, the concurrency control algorithms that we evaluate. We
compare these algorithms in Section 6.4.1 through 6.4.4 for each of the
four environments. In Section 6.5 we summarize the results of Section
6. Details of the simulation results can be found in the Appendix.

To use this section as a design guide, a system designer must first
classify his system environment, using the following three parameters.
First, he must decide whether his system environment is I/O bound or
communication bound. Second, he must estimate the average number of data
items, as a percentage of the total number of data items in the
database, requested by a transaction (transaction size). Third, he must
estimate the average system load, which is the product of the
transaction size and the multiprogramming level of the system (number of
transactions running concurrently). Using these three parameters and
Figure 6.1, the designer can find his system classification. For each
classification, he can find the comparison of various distributed
concurrency control algorithms in Section 6.4.1 through Section 6.4.4.
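The second and third parameters can be computed as in the sketch below.
It is a minimal illustration: the function names are hypothetical, the
0.05% and 0.2% thresholds are those given in Section 6.1, and the
intermediate band is left undecided because its boundary depends on
Figure 6.1.

```python
def system_load(tz, mp, dz):
    """System load: transaction size times multiprogramming level,
    divided by the database size (all sizes counted in data items)."""
    return tz * mp / dz


def size_class(tz, dz):
    """Classify by transaction size as a percentage of the database."""
    pct = 100.0 * tz / dz
    if pct < 0.05:
        return "short"
    if pct > 0.2:
        return "long"
    # Between 0.05% and 0.2% the class depends on the system load.
    return "depends on system load (see Figure 6.1)"
```

For a hypothetical database of 4096 data items, a transaction size of 1
is short transaction loaded, a size of 16 is long transaction loaded,
and a size of 4 falls in the band that Figure 6.1 resolves.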


6.2  Performance  Model 


We assume that there are two kinds of transactions: read-only
transactions and write transactions (update transactions). Write
transactions always read what they write, and write what they read. This
assumption may seem restrictive, but it is a good approximation of real
applications. Our earlier simulation results [LIN81a] showed that the
total number of requests and the ratio of read-only requests to write
requests active at any moment in the system have much greater impact on
the system performance than the ratio of read-only to write
transactions. Moreover, our analysis shows that a more general
assumption about transactions would not favor any concurrency control
algorithm; thus for performance comparison of the algorithms, this
assumption does not distort the results. To use the results of this
section to evaluate the performance of a system whose transactions read
more than they write, the ratio of read-only to write transactions in
the system can be adjusted upward.

A read-only transaction consists of a sequence of read-only
requests, and each request reads a data item. A write transaction
consists of a sequence of write requests (update requests), followed by
a two-phase commit. Requests from a transaction are processed
sequentially; another request is initiated only after the previous one
has been successfully processed.

As  previously  described,  a  distributed  DBMS  consists  of  TMs, 
schedulers,  and  DMs.  Each  transaction  is  managed  by  a  TM,  which 
sequences  its  requests  and  sends  them  to  the  appropriate  scheduler  to  be 
processed. If the scheduler site is different from the TM site, a
communication delay is incurred.

If a request is read-only, the scheduler requests a read lock for
the requested data item (assuming that a two phase locking algorithm is
used). Depending on the particular concurrency control algorithm used,
some lock managers may grant the lock without checking whether the
request conflicts with another transaction. Other lock managers may
check for the conflict. If a conflict is found, the read-only request
waits and incurs a blocking delay. Depending on the concurrency control
algorithm used, the scheduler may initiate a deadlock detection when
blocking occurs, thus incurring processing and possibly communication
overhead. When the lock for the requested data item is obtained, the
scheduler sends the read-only request to the appropriate DM, and the
read-only request incurs a processing delay. A read-only transaction
ends after all its requests have been successfully processed.

A write request is processed in a manner similar to a read request,
except that successful processing of all write requests of a transaction
is always followed by a two-phase commit, and a write transaction ends
after the two-phase commit is successfully processed (two-phase commit
is the only reliability algorithm that we use in our simulation of
concurrency control algorithms).
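The conflict check that a lock manager performs can be sketched as a
small lock table. This is an illustration under simplifying
assumptions, not the simulated lock manager: the class and method names
are hypothetical, blocked requests are reported rather than queued, and
deadlock detection and the two-phase commit are omitted.

```python
class LockTable:
    """Read locks are shared; a write lock is exclusive."""

    def __init__(self):
        self.readers = {}  # item -> set of transactions holding read locks
        self.writer = {}   # item -> transaction holding the write lock

    def request(self, txn, item, mode):
        """Return 'granted' or 'blocked' for a 'read' or 'write' request."""
        holder = self.writer.get(item)
        others = self.readers.get(item, set()) - {txn}
        if holder is not None and holder != txn:
            return "blocked"  # another transaction holds the write lock
        if mode == "write":
            if others:
                return "blocked"  # other transactions hold read locks
            self.writer[item] = txn
        else:
            self.readers.setdefault(item, set()).add(txn)
        return "granted"

    def release_all(self, txn):
        """Release every lock held by txn when the transaction ends."""
        for holders in self.readers.values():
            holders.discard(txn)
        for item in [i for i, h in self.writer.items() if h == txn]:
            del self.writer[item]
```

All locks are released together at the end of the transaction, which
keeps the sketch two-phase: no lock is acquired after any is released.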

If timestamp based algorithms are used, a timestamp is assigned to
each transaction, and requests from the transaction inherit the
transaction timestamp. Each data item also has read and write timestamps
that record the timestamps of the transactions that last read from (or
wrote into) the data item. For all the timestamp algorithms that we have
evaluated, the scheduler always resides at the site of a DM, and a
request is always sent to the scheduler at the site where the data is to
be accessed. When a scheduler receives a request, it compares the
timestamp of the request with the read and write timestamp(s) of the
data item, and it may or may not delay the request, depending on the
particular algorithm used. If the request is not blocked, it is sent to
the DM at the scheduler site, and the request incurs a processing delay.

We simulate both I/O bound and communication bound system
environments. In the I/O bound environment, we explicitly simulate
queueing for local processing, which combines CPU and I/O processing. We
differentiate between local processing of simple messages, such as lock
request, lock release, and deadlock detection, and local processing of
data requests. The latter needs more processing time than the former. In
the I/O bound environment, we do not simulate queueing for communication
channels. Communication delay is simply simulated by a delay drawn from
a probabilistic distribution.

In the communication bound environment, we explicitly simulate
queueing for communication channels, but not for local processing
resources. In some cases, we differentiate between message and data
transmission. The latter takes longer than the former. We simulate
local delay (combining I/O and CPU processing) by drawing a random
number from a probabilistic distribution.
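A delay of this kind can be drawn as shown below. The choice of an
Erlang distribution follows the processing delay described in Section 5;
the distribution used for the delays in this section is not stated, so
the distribution, the shape parameter k, and the function name are
assumptions made for this example.

```python
import random


def erlang_delay(rng, mean=1.0, k=2):
    """Draw one delay from an Erlang-k distribution with the given mean.

    An Erlang-k variate is a gamma variate with integer shape k; with
    scale mean / k its expected value is mean.
    """
    return rng.gammavariate(k, mean / k)
```

Passing a seeded random.Random instance as rng makes a simulation run
repeatable.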

The performance parameters that we use to compare distributed
concurrency control algorithms include read throughput, write
throughput, average read response time, and average write response time.
Read throughput is the number of read-only requests successfully
completed per time unit; read-only requests processed and subsequently
aborted are not included. The write throughput is similarly defined.
Read response time is measured from the time a read-only request is
initiated by a TM to the time when the next read-only request of the
same transaction is initiated by the same TM. Thus, it may include
communication delay, blocking delay, and processing delay. Average read
response time is the average of the response times of all successfully
completed read-only requests. Average write response time is similarly
computed.
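These definitions translate directly into two small computations. The
sketch below is illustrative: the function names are hypothetical, and
it assumes that the list of initiation times for a transaction ends with
the time at which the transaction finished, so that the last request
also has a response time.

```python
def throughput(completed_requests, elapsed):
    """Successfully completed requests per time unit (aborted excluded)."""
    return completed_requests / elapsed


def average_response_time(initiation_times):
    """Average response time of the requests of one transaction.

    The response time of a request runs from its initiation by the TM
    to the initiation of the next request of the same transaction.
    """
    gaps = [b - a for a, b in zip(initiation_times, initiation_times[1:])]
    return sum(gaps) / len(gaps)
```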

In addition to blocking delay, communication delay, and processing
delay, other factors also affect average response times and throughputs
(e.g., transaction abortion, deadlock detection, and multiple versions
of data). The concurrency control algorithms evaluated in this section
can be differentiated by the way they trade off these factors. Some
algorithms accept longer blocking delay in exchange for fewer
transaction abortions, and others make the reverse trade. Some accept
more communication delay in exchange for less blocking delay, and others
make the reverse trade. We describe these algorithms in the next
section. In Section 6.4, based on the total throughput, we compare and
rank these algorithms. Detailed data of the performance parameters can
be found in the Appendix.

6.3  Description  of  Algorithms 

The algorithms that we will consider are listed below. Selection
of these algorithms is based on our earlier heuristic evaluation
reported in [BERN81a]. The selected algorithms were shown to perform
better than the algorithms discarded. Names of some algorithms are
linked by the conjunction "and" (e.g. Primary Site and Primary Site).
The term before the conjunction describes the method used for read
requests, and the term after the conjunction describes the method used
for write requests. These algorithms are described briefly in this
section and summarized in Figure 6.2. Details of these algorithms can be
found in the references.

1.  Primary Site and Primary Site Two Phase Locking (C-C)

2.  Primary Copy and Primary Copy Two Phase Locking (P-P)

3.  Basic and Basic Two Phase Locking (B-B)

4.  Basic and Primary Copy Two Phase Locking (B-P)

5.  Basic and Primary Site Two Phase Locking (B-C)

6.  DDM Multiple Version and Optimistic Two Phase Locking (DDM)

7.  Basic and Optimistic Two Phase Locking (Opm)

8.  Majority Consensus Timestamp (Maj)

9.  Wait-Die Two Phase Locking (Die)

10.  Basic Timestamp (BaT)

11.  Multiple Version Timestamp (MvT)

12.  Dynamic Timestamp (Dyn)

The SDD-1 algorithm is not explicitly covered because the Dynamic
Timestamp algorithm is an improved version of it ([LIN79], [LIN81]).
Neither is the Conservative Timestamp algorithm covered, because this
algorithm essentially executes transactions serially in timestamp order.
Thus it can perform better than other algorithms only when the
transaction size is very large and the system load is extremely heavy,
so that concurrent execution of transactions becomes counterproductive.

The Primary Site and Primary Site method is essentially a centralized
two-phase locking method. All requests for read locks and write
locks are sent to and processed by a designated primary site, which may
use backup sites to improve resiliency. This method accepts more
transaction blocking in exchange for fewer transaction abortions, and it
checks for lock conflict as early as possible. It detects deadlock as
early as possible, and it avoids distributed deadlock detection; but it
has a bottleneck at the primary site.

The Primary Copy and Primary Copy method is a generalized version
of the Primary Site and Primary Site method. All requests for read
locks and write locks are sent to and processed by a designated primary
copy site. However, primary copy sites for different data items may be
different; thus distributed deadlock may occur. This method also trades
fewer transaction abortions for more transaction blocking, and it checks
for lock conflict as early as possible. It requires distributed deadlock
detection, but it may delay deadlock detection to reduce communication
overhead.

The  Basic  and  Basic  method  sets  read  locks  and  reads  data  locally 
if  a  local  copy  is  available;  otherwise  it  locks  and  reads  the  closest 
copy.  It  sets  write  locks  globally.  For  each  update  request,  an  update 
lock  is  requested  from  all  copies,  and  the  update  request  is  granted 
only  after  locks  from  all  copies  are  obtained.  This  method  trades  fas¬ 
ter  read-only  transaction  response  time  for  slower  write  transaction 
response time. It also trades more transaction blocking for fewer
transaction abortions. It checks for lock conflict and deadlock as early
as possible, at the expense of more communication overhead.

The  Basic  and  Primary  Copy  method  processes  read  requests  as  the 
previous  method  does,  but  it  requests  write  locks  only  from  a  designated 
primary copy. This method checks for most lock conflicts as soon as
possible, but it may delay distributed deadlock detection to reduce
communication overhead. This method also trades fewer transaction
abortions for more transaction blocking.

The Basic and Primary Site method is similar to the last method
except that update lock requests are sent to a central site instead of
to several primary copy sites. Thus deadlock detection is more
centralized than in the previous method, and overhead is more
concentrated at the primary site.
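
The three "Basic" read methods above differ only in where write locks are requested. The following is a minimal sketch of that choice, not code from the report: the function names, the policy labels, and the `distance` table are all hypothetical and introduced only for illustration.

```python
def read_target(copies, local_site, distance):
    """Pick the site at which a read lock is set.

    Reads go to the local copy if one exists; otherwise to the closest
    copy. `distance[a][b]` is an assumed cost table between sites.
    """
    if local_site in copies:
        return local_site
    # sorted() makes the choice deterministic when two copies tie.
    return min(sorted(copies), key=lambda site: distance[local_site][site])


def write_targets(policy, copies, primary_copy=None, primary_site=None):
    """Return the set of sites from which a write lock must be obtained.

    policy is a hypothetical label:
      "basic"        -> every copy (Basic and Basic, B-B)
      "primary_copy" -> the item's primary copy (Basic and Primary Copy, B-P)
      "primary_site" -> one central site (Basic and Primary Site, B-C)
    """
    if policy == "basic":
        return set(copies)
    if policy == "primary_copy":
        return {primary_copy}
    if policy == "primary_site":
        return {primary_site}
    raise ValueError(f"unknown policy: {policy}")
```

Under B-B a write must collect as many locks as there are copies, which is the communication cost the text attributes to that method; under B-P and B-C it collects one.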

The DDM [CHAN82a, CHAN82b] method avoids conflict between read
requests and update requests by keeping several versions of each data
item. For each update request, DDM locks locally if a local copy
exists; otherwise it locks the closest copy. The update lock is
propagated to other copies at transaction end. Detection of most
conflicts among update requests is delayed until transaction end. Thus
blocking delay is minimized for most write transactions at the expense
of more transaction abortions at transaction end.


The Basic and Optimistic method sets read and update locks locally,
if a local copy exists; otherwise it locks the closest copy. The update
lock is propagated to all copies when the transaction that holds the
update lock ends. Thus, distributed lock conflict checking and deadlock
detection are delayed until a transaction ends. This algorithm reduces
transaction blocking delay at the expense of more transaction abortions.
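
The delayed-propagation idea can be sketched as follows. This is an illustrative simplification, not the simulated algorithm: it assumes every site holds a copy of every item, tracks write locks only, and uses hypothetical names (`Site`, `lock_locally`, `propagate_at_end`).

```python
class Site:
    """One database site with its own write-lock table (item -> txn id)."""

    def __init__(self, name):
        self.name = name
        self.write_locks = {}


def lock_locally(site, item, txn):
    """Try to take a write lock at one site; False means txn must wait."""
    holder = site.write_locks.get(item)
    if holder is not None and holder != txn:
        return False
    site.write_locks[item] = txn
    return True


def propagate_at_end(txn, items, sites):
    """At transaction end, push txn's write locks to every site.

    Returns True if txn may commit, or False if a conflicting lock held
    at some site forces an abort. Nothing is installed on a conflict.
    """
    for site in sites:
        for item in items:
            holder = site.write_locks.get(item)
            if holder is not None and holder != txn:
                return False
    for site in sites:
        for item in items:
            site.write_locks[item] = txn
    return True
```

Two transactions at different sites can both lock the same item locally without waiting; the conflict surfaces only when one of them tries to propagate its lock, at which point it aborts rather than blocks.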

The Majority Consensus algorithm is similar to the Basic Optimistic
algorithm. Each transaction has two phases: a read phase and a commit
phase. During the read phase, a transaction reads locally if a local
copy exists; otherwise it reads the closest copy. Timestamps of data
items read by a transaction are recorded. During the commit phase,
both read-only and update transactions must be certified by comparing
the timestamps of the data read by each transaction to the transaction
timestamp. Because of the certification step, read-only transactions
require more communication overhead in this algorithm than in the Basic
Optimistic algorithm. The details of the algorithm can be found in
[BERN81a, THOM79]. If the algorithm is modified to favor read-only
transactions so that read-only transactions need no certification, then
it requires no more communication overhead than the Basic Optimistic
algorithm. This algorithm checks for lock conflicts as late as possible,
and it trades less transaction blocking for more transaction abortions.

In  the  Wait-Die  algorithm,  a  unique  sequence  number  is  attached  to 
every  transaction.  A  transaction  always  locks  locally  if  a  local  copy 
is  available;  otherwise  it  locks  the  closest  copy.  The  locks  are  pro¬ 
pagated  to  other  copies  when  the  transaction  commits.  Whenever  a  tran¬ 
saction  is  blocked  by  another  transaction,  the  algorithm  compares  the 
sequence  numbers  of  the  two  transactions.  If  the  blocked  transaction 
has a lower priority sequence number, it waits; otherwise it aborts.
This  algorithm  checks  local  lock  conflict  as  soon  as  possible,  but  it 
checks  distributed  conflict  at  transaction  end.  It  has  no  transaction 
deadlock  (at  the  expense  of  more  transaction  abortions). 
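
The comparison rule can be written in a few lines. The sketch below follows the usual Wait-Die convention, which is an assumption here: a smaller sequence number means an older transaction, an older blocked transaction waits, and a younger one aborts ("dies"). The report's own phrasing of the priority order is terse, so treat the direction of the comparison as illustrative.

```python
from enum import Enum


class Decision(Enum):
    WAIT = "wait"
    ABORT = "abort"


def wait_die(requester_seq: int, holder_seq: int) -> Decision:
    """Decide what a blocked transaction does under Wait-Die.

    Assumption: smaller sequence number = older transaction.
    An older requester waits for the lock holder; a younger one aborts.
    """
    if requester_seq == holder_seq:
        raise ValueError("sequence numbers must be unique")
    if requester_seq < holder_seq:
        return Decision.WAIT
    return Decision.ABORT
```

Because a transaction only ever waits for a younger one, a cycle of waiting transactions cannot form, which is why no deadlock detection is needed.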

In the Basic Timestamp method, a read and a write timestamp are
attached to each data item of the database. Each transaction that reads
or updates the data item updates its read or write timestamp. Conflict
is detected by comparing the timestamp of the transaction that reads or
writes a data item with the timestamps of the data item, and not by
comparing the timestamps of two transactions as done by the Wait-Die
algorithm. This algorithm is similar to the Wait-Die algorithm because it
also avoids transaction deadlock. Unlike the Wait-Die algorithm, it has
no blocking delay and possibly has more transaction abortions. This
algorithm may have fewer transaction abortions than the Wait-Die
algorithm when most transactions are read-only, because it allows two
transactions (a read-only and a write) to access the same data item
simultaneously.
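
As an illustration of comparing a transaction's timestamp with an item's timestamps, here is a minimal sketch of the textbook timestamp-ordering rules. The variant simulated in this report may differ in detail (for example in how read-only transactions are treated); the names `Item`, `Abort`, `read`, and `write` are hypothetical.

```python
from dataclasses import dataclass


class Abort(Exception):
    """Raised when a request arrives too late in timestamp order."""


@dataclass
class Item:
    value: object = None
    read_ts: int = 0   # largest timestamp of any transaction that read the item
    write_ts: int = 0  # timestamp of the transaction that last wrote the item


def read(item: Item, ts: int):
    """Read on behalf of a transaction with timestamp ts."""
    if ts < item.write_ts:
        raise Abort("item already overwritten by a younger transaction")
    item.read_ts = max(item.read_ts, ts)
    return item.value


def write(item: Item, ts: int, value) -> None:
    """Write on behalf of a transaction with timestamp ts."""
    if ts < item.read_ts or ts < item.write_ts:
        raise Abort("a younger transaction already read or wrote the item")
    item.value = value
    item.write_ts = ts
```

No request ever waits: it is either accepted immediately or rejected, which matches the absence of blocking delay noted above.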

The Multiple Version Timestamp algorithm is a generalization of the
previous algorithm. It keeps several versions of each data item in
order to reduce conflict between read-only transactions and update
transactions. Thus, this method trades more overhead of maintaining
multiple data versions for fewer transaction abortions.
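
To show how extra versions remove read rejections, the sketch below implements the textbook multiversion timestamp rules, again as an assumption rather than the report's exact algorithm. The class names are hypothetical, and transaction timestamps are assumed to be positive and unique.

```python
import bisect
from dataclasses import dataclass


class Abort(Exception):
    """Raised when a write would invalidate a read that already happened."""


@dataclass
class Version:
    write_ts: int
    value: object
    read_ts: int = 0  # largest timestamp that has read this version


class MVItem:
    """A data item stored as a list of versions sorted by write timestamp."""

    def __init__(self, initial=None):
        self.versions = [Version(0, initial)]

    def _visible(self, ts: int) -> int:
        """Index of the newest version written at or before ts (ts > 0)."""
        keys = [v.write_ts for v in self.versions]
        return bisect.bisect_right(keys, ts) - 1

    def read(self, ts: int):
        """Reads are never rejected: they see the version current as of ts."""
        version = self.versions[self._visible(ts)]
        version.read_ts = max(version.read_ts, ts)
        return version.value

    def write(self, ts: int, value) -> None:
        """Reject only if a younger transaction read the version we follow."""
        index = self._visible(ts)
        previous = self.versions[index]
        if previous.read_ts > ts:
            raise Abort("a younger transaction read the preceding version")
        if previous.write_ts == ts:
            previous.value = value  # same transaction rewriting its version
            return
        self.versions.insert(index + 1, Version(ts, value, read_ts=ts))
```

The cost side of the trade is visible in `versions`, which grows with every accepted write and would need pruning in a real system.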

The Dynamic Timestamp algorithm [LIN79, LIN81] is an improved
version of the SDD-1 algorithm; it is unique among all the algorithms
that we will compare for the following reasons. It requires transaction
timestamps but not data item timestamps. It does not avoid transaction
blocking; thus it trades more transaction blocking for fewer transaction
abortions. But it uses preanalysis of transactions to reduce
unnecessary transaction blocking. This algorithm may require a lot of
communication overhead when many null write messages are needed [BERN82,
LIN79, LIN81], and its performance may depend on system load [LIN81].
Thus it may perform poorly in some system environments.

The  principal  characteristics  of  these  algorithms  are  summarized  in 
Figure  6.2. 

[Figure 6.2 table not reproduced. Surviving legend fragments:
"transaction blocking is preferred"; "distributed conflict is checked
at transaction end."]

Figure 6.2 Summary of Concurrency Control Algorithms


6.4  Performance  Evaluation 


6.4.1  Short Transaction Loaded & IO Bound

In this section we compare the performance of distributed concurrency
control algorithms in a system environment in which most transactions
are relatively short and the IO resource is the performance
bottleneck. The comparison of these algorithms is summarized in Figure
6.3.  The  comparison  is  based  on  actual  simulation  results  except  for 
the  Wait-Die,  Majority  Consensus  Timestamp,  and  Dynamic  Timestamp  algo¬ 
rithms.  The  evaluation  of  the  Wait-Die  algorithm  is  based  on  its  simi¬ 
larity  to  the  Basic  Timestamp  algorithm;  the  evaluation  of  the  Dynamic 
Timestamp  algorithm  is  based  on  the  results  of  [LIN81];  and  the  evalua¬ 
tion  of  the  Majority  Consensus  Timestamp  algorithm  is  based  on  its  simi¬ 
larity  with  the  Basic  Optimistic  algorithm. 

Figure 6.3 shows that five algorithms perform better than the others:
the Basic Timestamp, Multiple Version Timestamp, DDM, Basic Optimistic,
and Wait-Die algorithms.

In the short transaction loaded and IO bound environment, we found
that transaction abortion is a better strategy than transaction blocking
(i.e., it is better to abort than to wait). The abortion strategy is
used by the Basic Timestamp and Multiple Version Timestamp algorithms,
and to a large degree by the Wait-Die algorithm. We also found that it
is better to delay lock conflict detection than to detect lock conflict
early. Both the DDM and the Basic Optimistic algorithms use the delay
strategy.

Although the DDM algorithm uses locking for write transactions, and
the Basic Optimistic algorithm uses locking for both read and write
transactions, blocking occurs only among local transactions that access
data from the same site. Transactions running at different sites never
block each other. Write locks are propagated to other sites at
transaction end; conflicts among transactions running at different sites
are detected then, and they always result in transaction abortions.
Therefore the performance of these two algorithms is closer to that of
the timestamp algorithms than to that of the two-phase locking
algorithms. However, notice that the DDM and Basic Optimistic algorithms
always abort transactions at transaction end, while the timestamp
algorithms may abort transactions at an earlier phase of their execution.

These  five  algorithms  perform  equally  well  in  most  cases.  The 
timestamp  algorithms  perform  better  than  the  DDM  and  Basic  Optimistic 
algorithms  when  the  database  is  fully  redundant  (thus  read-only  transac¬ 
tions  complete  quickly),  the  R/W  ratio  is  high  (probability  of  conflict 
among  data  requests  is  small),  and  local  delay  is  large  (local  blocking 
delay is large and abortion at transaction end is expensive). However,
when the database is less redundant, the DDM and Basic Optimistic
algorithms perform slightly better than the timestamp algorithms. In
that case both read-only and write transactions require some remote data
accesses and take longer to complete, which raises the probability of
conflict among transactions and causes the timestamp algorithms to abort
more transactions.

The  Basic  Timestamp  algorithm  performs  as  well  as  the  Multiple  Ver¬ 
sion  Timestamp  algorithm,  and  the  latter  requires  more  overhead  and 
storage  space  for  keeping  multiple  versions  of  data  [LINN83].  Therefore 
the  Basic  Timestamp  algorithm  is  preferable  to  the  Multiple  Version 
Timestamp  algorithm,  unless  the  multiple  versions  of  data  are  required 
in  any  case  for  database  recovery  and  resiliency.  Similarly,  the 
difference  in  performance  between  the  DDM  and  Basic  Optimistic  algo¬ 
rithms  is  very  small,  and  the  former  needs  higher  overhead  and  more 
storage space for keeping multiple versions of data. The Basic
Optimistic algorithm is preferable, unless the multiple versions of data
are required in any case for database recovery and resiliency.

The Wait-Die algorithm performs slightly worse than the Basic
Timestamp algorithm when most transactions are read-only. When a
read-only transaction conflicts with a write transaction, the timestamp
algorithms never abort the read-only transaction, and they abort the
write transaction only when a nonserializable execution may occur.
However, when most transactions are write transactions, the Wait-Die
algorithm is preferred because it performs as well as the Basic
Timestamp method and it needs no data item timestamps, which require
storage space and processing overhead.

The Dynamic Timestamp algorithm performs best when most transactions
are read-only, communication is fast, the database is almost fully
redundant, and preanalysis can be done on most transactions. In this
environment, the fast protocols R1, R1a, R1ab, and R1b [LIN79, LIN82]
apply to most transactions. Assuming system conditions remain the same
except that the database is not redundant, the Dynamic Timestamp
algorithm still performs relatively well, because more efficient
protocols (R2, R2a, R2ab, and R2b) apply to most transactions. These
protocols are not as efficient as the group of R1 protocols, but they
are relatively fast compared with the R3 protocol. In all other cases,
either when the communication is slow or when most transactions update
the database, the Dynamic Timestamp algorithm is not efficient.

The Majority Consensus algorithm performs reasonably well, but not
as well as the Basic Optimistic algorithm. The Majority Consensus
algorithm as proposed in [THOM79] requires extra communication overhead
for read-only transactions. If the algorithm is modified to favor
read-only transactions, so that read-only transactions need not be
certified, then it would perform as well as the Basic Optimistic
algorithm.

To  summarize,  in  this  environment  transaction  abortion  is  a  better 
strategy  than  transaction  blocking,  and  delayed  lock  conflict  checking 
is  a  better  strategy  than  early  lock  conflict  checking. 


6.4.2  Short  Transactions  &  Communication  Bound 

In this section we compare the performance of distributed concurrency
control algorithms in a system environment in which most transactions
are relatively short and the communication channel is the performance
bottleneck. The comparison of the algorithms is summarized in
Figure  6.4.  The  comparison  is  based  on  actual  simulation  results  except 
for  the  Wait-Die,  Majority  Consensus,  and  the  Dynamic  Timestamp  algo¬ 
rithms.  The  evaluation  of  the  Wait-Die  algorithm  is  based  on  its  simi¬ 
larity  to  the  Basic  Timestamp  algorithm;  the  evaluation  of  the  Dynamic 
Timestamp  algorithm  is  based  on  the  results  of  [LIN81];  and  the  evalua¬ 
tion  of  the  Majority  Consensus  algorithm  is  based  on  its  similarity  to 
the  Basic  Optimistic  algorithm. 

Figure 6.3 Performance Comparison: Short Transaction Loaded & IO Bound

Figure 6.4 shows that seven algorithms perform better than the others:
Basic-Primary Copy, Basic Timestamp, Multiple Version Timestamp,
DDM, Basic Optimistic, Wait-Die, and Dynamic Timestamp.

We found that, as in the IO bound environment, transaction abortion
is a better strategy than transaction blocking, and that delayed lock
conflict detection is a better strategy than early detection. However,
because of the communication channel bottleneck, the performance of the
algorithms that require extra communication messages degrades in some
cases.

The  Basic  Timestamp  and  Multiple  Version  Timestamp  algorithms  per¬ 
form  best  in  all  cases.  However,  when  the  database  is  fully  redundant, 
the DDM and Basic Optimistic algorithms perform just as well. With full
redundancy, read-only
transactions never incur communication delays, and write transactions
incur  communication  delays  only  during  the  commit  phase.  Therefore 
transactions  finish  fast,  blocking  delay  is  shorter,  and  abortion  at 
transaction  end  is  less  expensive. 

The Majority Consensus algorithm, as proposed in [THOM79], does not
perform well because of the extra communication messages required for
read-only transactions. If the algorithm is modified to favor read-only
transactions, so that read-only transactions need not be certified, the
algorithm would perform as well as the Basic Optimistic algorithm.

The Wait-Die algorithm performs just as well as the timestamp
algorithms in most cases. However, when most transactions are read-only,
the  Wait-Die  algorithm  unnecessarily  aborts  more  read-only  transactions 
than  the  timestamp  algorithms,  thus  performing  worse  than  the  timestamp 
algorithms. 

The  DDM  algorithm  performs  as  well  as  the  timestamp  algorithms  when 
the  database  is  fully  redundant.  However,  when  the  database  is  less 
redundant  and  most  transactions  are  read-only,  its  performance  degrades 
as  shown  in  Figure  6.4.  When  the  database  is  not  fully  redundant, 
read-only  transactions  require  one  extra  communication  message,  which 
causes a long delay in a communication bound environment.

The Basic-Primary Copy algorithm performs 10% to 20% worse than the
best algorithms in all cases, because it incurs extra communication
messages when obtaining locks from the primary copies, and it uses
transaction blocking instead of transaction abortion. The Dynamic
Timestamp algorithm performs best when most transactions are read-only
and can be preanalyzed. In this environment, the most efficient
protocols can be used and communication overhead for null-write messages
is minimized.

Since  the  Basic  Timestamp  algorithm  performs  as  well  as  the  Multi¬ 
ple  Version  Timestamp  algorithm,  the  former  is  preferable  unless  the 
multiple  versions  of  data  are  required  in  any  case  for  database  recovery 
and resiliency. Similar observations apply to the DDM and Basic
Optimistic algorithms [LINN83].

Our  conclusion  is  that  in  this  environment  abortion  is  better  than 
blocking,  and  that  delayed  lock  conflict  checking  is  better  than  early 
lock  conflict  checking.  However,  some  algorithms  that  use  these  two 
strategies  may  not  perform  well  in  some  cases  because  they  require  extra 
communication  messages. 


Rank 1 is best and Rank 6 is worst.

Rank numbers have no absolute meaning. They only show relative
performance across a row, not a column.

R/W: Ratio of read-only transactions to write transactions
L/C: Ratio of local delay to communication delay, excluding queuing delay
Red: Redundancy of the database
*  : Does not matter

Figure 6.4 Performance Comparison: Short Transaction Loaded & Communication Bound


6.4.3  Long Transaction Loaded & IO Bound

In this section we compare the performance of distributed concurrency
control algorithms in a system environment in which most transactions
are relatively long and the IO resource is the bottleneck. The
comparison is summarized in Figure 6.5. The comparison is based on
actual  simulation  results  except  for  the  Wait-Die  and  Majority  Consensus 
algorithms.  The  evaluation  of  the  Wait-Die  algorithm  is  based  on  its 
similarity  to  the  Basic  Timestamp  algorithm;  and  the  evaluation  of  the 
Majority  Consensus  algorithm  is  based  on  its  similarity  to  the  Basic 
Optimistic  algorithm. 

Figure 6.5 shows that three algorithms perform better than the others:
Basic Primary, DDM, and Basic-Optimistic.

In this environment (long transactions, heavy system load)
transactions conflict with each other more often, but only a fraction of
the conflicts lead to transaction deadlocks. Thus, transaction blocking
is better than indiscriminate transaction abortion. Moreover, prompt lock
conflict detection is better than procrastination. Lock conflicts that
are detected at transaction end always lead to deadlocks. The Basic
Primary, DDM, and Basic Optimistic algorithms use the blocking strategy.
The Basic Primary algorithm uses the early lock conflict detection
strategy.

The Basic Primary Copy algorithm performs best in this environment
because it does not abort a transaction unless it deadlocks, and it
detects lock conflicts as soon as they occur. However, when most
transactions are read-only, and the database is not fully redundant, the
Basic Primary Copy algorithm does not perform as well as the DDM and
Basic-Optimistic algorithms, because the cost of the extra communication
messages that it requires for write locks and deadlock detection exceeds
the cost of the extra transaction abortions incurred by the DDM and
Basic-Optimistic algorithms.

The DDM and the Basic Optimistic algorithms perform well in
partially redundant databases, because more lock conflicts are detected
during the reading phase of transactions and fewer transactions abort at
the commit phase. However, when the database is fully redundant, most
lock conflicts are detected during the commit phase, which always leads
to deadlocks and transaction abortions, thus resulting in the poorer
performance of these two algorithms in this condition.

The timestamp algorithms do not perform as well as the Basic-Primary
method because, in this environment, transaction blocking is better than
transaction abortion. However, the timestamp algorithms perform better
than the DDM and Basic-Optimistic algorithms when the database is fully
redundant. Read-only transactions incur no communication delay and
complete quickly; the read phase of write transactions also completes
quickly. Thus conflicts between read-only transactions and write
transactions that result in the abortion of write transactions are
reduced. In addition, when the database is fully redundant, the
timestamp algorithms detect more conflicts at the read phase, thus
aborting more transactions at earlier stages of processing, while the
DDM and Basic-Optimistic algorithms detect more conflicts at the commit
phase, thus aborting more transactions at their ends. However, when the
database is not fully redundant, the DDM and Basic-Optimistic algorithms
detect more conflicts at the read phase, and they abort more
transactions at the early stages of processing, thus performing better
than the timestamp algorithms.


The Wait-Die algorithm performs as well as the Basic Timestamp
algorithm, except when most transactions are read-only. Then the Basic
Timestamp algorithm has higher throughput of read-only transactions than
the Wait-Die algorithm.

The Majority Consensus algorithm also performs poorly because it
delays lock conflict detection until transaction end, thus resulting in
many late transaction abortions. In fact, all certifier algorithms that
certify transactions at transaction end perform badly in the long
transaction environment. The Primary Site & Primary Site (C-C) and the
Primary Copy & Primary Copy (P-P) algorithms also perform relatively well
when the database is fully redundant. These two algorithms abort fewer
transactions than the Basic Timestamp, Multiple Version Timestamp, DDM,
and Basic Optimistic algorithms, and the savings in transaction
abortions more than make up for the extra communication messages required
by the two algorithms. The Basic-Basic algorithm does not perform as well
because it requires many more communication messages than other
algorithms.

To  summarize,  in  this  environment  transaction  blocking  is  better 
than  transaction  abortion,  and  early  lock  conflict  detection  is  better 
than  late  detection. 

Rank 1 is best and Rank 6 is worst.

Rank numbers have no absolute meaning. They only show relative
performance across a row, not a column.

Figure 6.5 Performance Comparison: Long Transaction Loaded & IO Bound


6.4.4  Long Transactions & Communication Bound

In this section, we compare the performance of distributed concurrency
control algorithms in a system environment in which most transactions
are long and the communication channel is the bottleneck. The
comparison of these algorithms is summarized in Figure 6.6. The comparison
is  based  on  actual  simulation  results  except  for  the  Wait-Die  and  Major¬ 
ity  Consensus  algorithms.  The  evaluation  of  the  Wait-Die  algorithm  is 
based  on  its  similarity  to  the  Basic  Timestamp  algorithm;  and  the 
evaluation  of  the  Majority  Consensus  algorithm  is  based  on  its  similar¬ 
ity  to  the  Basic  Optimistic  algorithm. 

Figure  6.6  shows  that  six  algorithms  perform  better  than  the  oth¬ 
ers:  Basic  Timestamp,  Multiple  Version  Timestamp,  DDM,  Basic  Optimistic, 
Basic  Primary,  and  Wait-Die. 

In this system environment (long transactions, heavy system load,
and long communication delay) transactions conflict with each other more
often, but only a fraction of the conflicts lead to deadlocks; thus,
transaction blocking is better than indiscriminate transaction abortion.
Moreover, early lock conflict detection is better than procrastination.
Lock conflicts detected at transaction end always lead to deadlocks.
The Basic Primary, DDM, Basic Optimistic, and to a certain degree the
Wait-Die algorithms use the blocking strategy; and the Basic Primary and
Wait-Die algorithms detect lock conflicts as early as possible. In
addition, because of long communication delay, algorithms requiring
extra communication messages may not perform well even if they use
transaction blocking instead of transaction abortion. The DDM and the
Basic Primary algorithms require extra communication messages in some
cases.

The Basic Primary Copy algorithm performs the best when the database
is not fully redundant because it requires no more communication
messages than the other algorithms, and because it causes fewer
unnecessary transaction abortions. Even when the database is not fully
redundant, if most transactions are write transactions and local delay is
high relative to the communication delay, the Basic Primary Copy
algorithm still performs better than the Basic Timestamp, Multiple Version
Timestamp, DDM, and Basic-Optimistic algorithms, because the latter
abort write transactions frequently. However, when the database is
fully redundant, the Basic Primary Copy algorithm requires more
communication messages than the Basic Timestamp, Multiple Version
Timestamp, DDM, and Basic Optimistic algorithms. Thus, except for the
cases above, the extra communication messages required by the Basic
Primary Copy algorithm make its performance worse than that of the Basic
Timestamp, Multiple Version Timestamp, DDM, and Basic-Optimistic
algorithms in this communication bound environment.

The timestamp based algorithms perform best when the database is
fully redundant; then read-only transactions incur no communication
delay and complete quickly. The read phase of write transactions also
completes quickly. When read-only transactions and the read phase of
write transactions complete quickly, conflicts between read-only and
write transactions that result in abortion of the write transactions are
reduced. Thus, unnecessary transaction abortion is reduced.

The DDM method avoids conflicts between read-only transactions and
write transactions, but it pays with sure abortions of write
transactions at transaction end. Thus, when most transactions are
read-only, it performs very well. The higher throughput of read-only
transactions makes up for the extra abortion of write transactions.
Notice that DDM requires an extra round of communication messages for
read-only transactions when the database is not fully redundant. Then
its performance degrades.

The  Basic-Optimistic  algorithm  also  performs  well  when  most  tran¬ 
sactions  are  read-only;  then  read-only  transactions  and  the  read  phase 
of  write  transactions  complete  quickly.  Otherwise  it  performs  poorly 
because  the  system  is  eventually  saturated  with  many  long  write  transac¬ 
tions  that  later  abort. 

The Wait-Die algorithm performs as well as the Basic Timestamp
algorithm  when  most  transactions  are  write  transactions,  but  not  as  well 
when  most  transactions  are  read-only  transactions.  Since  the  Wait-Die 
algorithm  needs  no  overhead  for  maintaining  data  item  timestamps,  it  is 
preferable  to  the  timestamp  based  algorithms  if  most  transactions  are 
write  transactions. 

The  Basic  &  Basic,  Primary  Copy  &  Primary  Copy,  and  Primary  Site  & 
Primary  Site  algorithms  perform  poorly  because  they  require  more  commun¬ 
ication  messages  than  other  algorithms.  Communication  overhead  is 
expensive  in  this  communication  bound  environment. 

To  summarize,  in  this  environment  transaction  blocking  is  better 
than  transaction  abortion,  and  early  lock  conflict  detection  is  better 
than  late  detection.  However,  some  algorithms  that  use  these  two  stra¬ 
tegies  may  not  perform  well  in  some  cases  because  they  require  extra 
communication  messages. 
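
The four summaries in Sections 6.4.1 through 6.4.4 can be collected into a small lookup table. The sketch below only restates those stated conclusions; the key and value labels are hypothetical shorthand, and the caveat about extra communication messages in the communication bound cases is carried as a flag.

```python
# (transaction length, bottleneck) -> (conflict handling, conflict detection,
#                                      extra-message caveat applies)
FINDINGS = {
    ("short", "io"): ("abort", "delayed", False),
    ("short", "communication"): ("abort", "delayed", True),
    ("long", "io"): ("block", "early", False),
    ("long", "communication"): ("block", "early", True),
}


def preferred_strategy(length: str, bottleneck: str):
    """Return the strategy pair the report found better in an environment."""
    handling, detection, caveat = FINDINGS[(length, bottleneck)]
    return {
        "conflict_handling": handling,
        "conflict_detection": detection,
        "watch_extra_messages": caveat,
    }
```

For example, `preferred_strategy("long", "io")` reports blocking with early detection, the opposite of what is reported for short transactions.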

Figure 6.6 Performance Comparison: Long Transactions & Communication Bound


6.5  Conclusion

We found that five of the twelve algorithms perform best in various
system environments: the Basic Timestamp, Multiple Version Timestamp,
DDM, Basic Optimistic, and Basic-Primary algorithms.

When most transactions are short, concurrency control algorithms
that abort conflicting transactions (such as the Basic Timestamp and
Multiple Version Timestamp algorithms) perform better than algorithms
that block conflicting transactions (such as the Basic Primary
algorithm). In this environment, transactions conflict rarely; and when
they do conflict, the blocking transactions tend to be longer than the
average transaction size and the blocking delay is long [LINN83]. If a
two-phase locking algorithm must be used, algorithms that delay lock
conflict checking (such as the DDM and the Basic Optimistic algorithms)
perform better than those that expedite lock conflict checking (such as
the Basic Primary algorithm). Unless the communication bandwidth is very
high, communication delay can devastate system performance; thus, the
designer should reduce communication delay by locally controlling and
accessing data as much as possible.


The  issue  of  balancing  communication  delay  against  data  distribu¬ 
tion  and  replication  is  part  of  the  complex  problem  of  distributed  data¬ 
base  design.  Distributed  database  design  must  also  take  into  account 
the  issues  of  distributed  query  processing  and  distributed  database 
reliability,  and  is  beyond  the  scope  of  this  handbook. 

Behavior of systems that have long transactions is very different
from that of systems that have short transactions. Long transactions
degrade system performance very quickly because they have more
transaction conflicts. Since only a fraction of these conflicts results
in deadlocks, concurrency control algorithms that use transaction
blocking often perform better than those that use transaction abortion
indiscriminately. Moreover, concurrency control algorithms that detect
transaction conflict earlier often perform better than those that detect
transaction conflict later. The effect of communication delay on the
performance of a system that has long transactions is even more
devastating than the effect on a system that has short transactions.
Thus the designer must reduce communication delay as much as possible by
controlling and accessing data locally.

However, no matter which concurrency control algorithm the designer
uses, a system that has long transactions always performs worse than a
system that has short transactions. The designer should design
transactions to access as much data in parallel as possible, and to
break long transactions into shorter transactions. Long transactions
that cannot be broken into shorter ones must be executed in background
mode.

Our  performance  study  shows  that  no  one  algorithm  performs  best  in 
all  system  and  application  environments.  If  the  system  environment  is 
stable,  the  database  designer  can  select  one  algorithm  that  performs 
best  in  the  environment.  If  the  system  environment  is  not  stable,  the 
database  designer  can  assign  different  weights  to  different  environments 
according  to  how  often  the  system  stays  in  each  environment.  The  data¬ 
base  designer  then  selects  the  algorithm  that  has  the  best  weighted 
average  performance. 
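
The weighted-average selection described above can be sketched as follows. This is an illustration only: the function name, the environment labels, and every performance figure are hypothetical and are not taken from the simulation results. The sketch assumes a higher score means better performance (for example, throughput).

```python
def best_weighted_algorithm(scores, weights):
    """Pick the algorithm with the best weighted-average performance.

    scores:  {algorithm: {environment: performance}}, higher is better
    weights: {environment: fraction of time spent in that environment}
    """
    total = sum(weights.values())
    averages = {
        alg: sum(weights[env] * perf[env] for env in weights) / total
        for alg, perf in scores.items()
    }
    best = max(averages, key=averages.get)
    return best, averages


# Hypothetical performance figures, invented for this example.
scores = {
    "Basic Timestamp": {"short": 10.0, "long": 2.0},
    "Basic Primary": {"short": 8.0, "long": 6.0},
}
weights = {"short": 0.5, "long": 0.5}

best, averages = best_weighted_algorithm(scores, weights)
print(best, averages)
```

With these invented numbers, the algorithm that blocks wins when the system spends half its time with long transactions; shifting the weights toward short transactions changes the choice.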

From  the  results,  we  can  also  conclude  that  the  best  algorithm 
would  be  one  that  could  be  adjusted  by  the  system  administrator  accord¬ 
ing  to  the  environment.  The  administrator  would  adjust  the  algorithm  to 
use  transaction  abortion  and  delay  lock  conflict  detection  whenever 
transactions  are  short,  and  to  use  transaction  blocking  and  detect  lock 
conflicts  as  soon  as  possible  whenever  transactions  are  long.  The  adju¬ 
stable  algorithm  would  also  alternate,  depending  on  the  load  on  the  com¬ 
munication  channel,  between  algorithms  that  have  more  localized  control 
and  algorithms  that  have  more  distributed  control. 
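
A minimal sketch of such an adjustable policy follows. The thresholds, the `Policy` and `choose_policy` names, and the rule that a heavily loaded channel favors localized control are all assumptions made for illustration; the study states only that the algorithm would alternate between the two kinds of control depending on channel load.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Policy:
    on_conflict: str     # "abort" or "block"
    conflict_check: str  # "delayed" or "early"
    control: str         # "localized" or "distributed"


def choose_policy(avg_txn_size, channel_load,
                  long_txn_threshold=10, high_load=0.7):
    """Pick settings from the observed environment.

    Both thresholds are illustrative assumptions, not measured values.
    """
    if avg_txn_size < long_txn_threshold:
        # Short transactions: abort on conflict, check for conflicts late.
        on_conflict, check = "abort", "delayed"
    else:
        # Long transactions: block on conflict, detect conflicts early.
        on_conflict, check = "block", "early"
    # Assumption: a heavily loaded channel favors localized control.
    control = "localized" if channel_load >= high_load else "distributed"
    return Policy(on_conflict, check, control)


print(choose_policy(avg_txn_size=4, channel_load=0.2))
print(choose_policy(avg_txn_size=30, channel_load=0.9))
```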
7.  Conclusion


The DDB Control and Allocation Project set out to achieve the
following objectives:

1.  Review  the  distributed  concurrency  control  research  published  in  the 
literature  and  incorporate  that  research  into  the  taxonomy  of  the 
distributed  database  concurrency  control  algorithms.  Based  on  this 
taxonomy,  we  would  develop  a  new  framework  for  distributed  database 
concurrency  control. 

2.  Develop  new  distributed  database  concurrency  control  algorithms 
using  the  framework  developed  in  1. 

3.  Simulate  the  performance  of  the  distributed  database  concurrency 
control  algorithms  that  are  found  to  be  dominant  in  the  previous 
study. 

4.  Build  an  analytical  model  of  distributed  database  concurrency  con¬ 
trol. 

5.  Survey  the  current  studies  of  reliability  and  recovery  of  distri¬ 
buted  database  systems  and  the  analysis  of  published  algorithms. 

6.  Develop  a  framework  for  reliability  and  recovery  of  distributed 
database  systems. 

7.  Consolidate  the  results  of  the  previous  tasks  into  a  system 
designer's  handbook. 

We  have  achieved  these  objectives,  and  the  results  are  described  in  this 
final  technical  report. 

The  first  objective  is  achieved  by  means  of  the  framework  discussed 
in  Section  II  of  Volume  I.  The  framework  facilitates  the  taxonomy  of 
distributed  concurrency  control  algorithms  by  identifying  the  essential 
component functions of distributed concurrency control mechanisms. This
framework  is  an  excellent  basis  for  further  research  in  the  standardiza¬ 
tion  of  distributed  concurrency  control  architecture. 


The second objective of developing new algorithms using this
framework is achieved by the new distributed concurrency control
algorithms described in Section III of Volume I. In this section, new
algorithms that store and use older versions of data items are described.

The  third  objective  of  simulating  and  evaluating  the  performance  of 
distributed  database  concurrency  control  algorithms  is  achieved  by  a 
series  of  reports  in  the  second  volume.  Sections  II  through  V  report 
the  relationship  between  the  performance  of  various  algorithms  and  the 
system  parameters.  Section  VI  summarizes  the  simulation  results  and 
compares  the  performance  of  twelve  algorithms.  The  results  of  the 
second  volume  serve  as  an  excellent  basis  for  designing  a  distributed 
database  designer's  aid.  The  Designer  Aid  would  help  the  system 
designer  to  design  distributed  transactions,  partition  the  database  into 
fragments,  replicate  and  distribute  the  fragments,  and  choose  the  con¬ 
currency  control  algorithm  that  performs  best  in  his  system  environment. 

The fourth objective of analytical modeling of the distributed
concurrency control algorithms is achieved through the analytical models
described in Sections IV and V of Volume I.

The  survey/study  of  reliability  and  recovery  of  distributed  data¬ 
base  systems  that  achieves  the  fifth  objective  is  reported  in  Sections 
VI  through  IX  of  Volume  I  and  in  the  third  semiannual  technical  report. 
Because  the  subject  is  relatively  unexplored,  only  a  few  algorithms  were 
reported. To discover new algorithms, further research is needed.

A  framework  for  the  reliability  and  recovery  of  a  distributed  data¬ 
base  system  achieving  the  sixth  objective  is  described  in  Section  VII  of 
Volume  I.  This  framework  captures  the  essential  components  of  existing 
reliability and recovery algorithms. But, because research on this
subject is in its primitive stage, more research is needed to refine the
framework and to use it to develop more efficient algorithms. Moreover,
the refined framework should become a basis for standardizing
distributed reliability and recovery architecture.

Finally, these results have been summarized in a separate
distributed database system designer's handbook. This handbook can help
the designer to select a distributed concurrency control algorithm that
performs best in his system environment. Of course, an automated tool
would be more helpful to the designer. The automated tool would receive
information from the designer about his system (e.g., system and
application parameters) and it would output to the designer information
about how to best design his system.

Overall  we  have  accomplished  what  we  set  out  to  do;  and  in  the  pro¬ 
cess  we  came  to  understand  more  fully  the  mechanism  of  concurrency  con¬ 
trol,  reliability,  and  recovery  of  distributed  database  systems.  The 
next  step  is  to  translate  these  results  and  this  understanding  into  a 
practical,  integrated  set  of  tools  that  aid  distributed  database 
designers,  and  into  a  standard  architecture  of  distributed  DBMS  that 
facilitates the interconnection of different DBMSs.


Notations  used  in  the  appendix  are  explained  here  and  in  the  figures. 

READ THROUGHPUT: average number of read-only requests successfully
processed per unit time (excluding requests processed
and subsequently aborted).

WRITE THROUGHPUT: average number of write requests successfully
processed per unit time (excluding requests processed
and subsequently aborted).

Average  Response  Per  Read  Request:  average  time  required  to  process 
a  read-only  request. 

Average  Response  Per  Write  Request:  average  time  required  to  process 
a  write  request. 
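
As an illustration of these definitions, the sketch below computes throughput and average response time from a request log. The log format and the function name are hypothetical. The sketch also assumes that response time is averaged over successfully processed requests only, which the definitions above do not state explicitly.

```python
def throughput_and_response(requests, elapsed):
    """Compute per-kind throughput and average response time.

    requests: list of (kind, response_time, committed) tuples, where kind
              is "read" or "write" and committed is False for requests
              that were processed and subsequently aborted
    elapsed:  length of the observation period, in the same time unit
    """
    result = {}
    for kind in ("read", "write"):
        # Requests that were later aborted do not count toward throughput.
        done = [r for r in requests if r[0] == kind and r[2]]
        result[kind] = {
            "throughput": len(done) / elapsed,
            "avg_response": sum(r[1] for r in done) / len(done) if done else None,
        }
    return result


# Hypothetical log: two committed reads, one aborted read, one committed write.
log = [
    ("read", 1.0, True),
    ("read", 3.0, True),
    ("read", 2.0, False),
    ("write", 4.0, True),
]
metrics = throughput_and_response(log, elapsed=2.0)
print(metrics)
```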

Basic  Basic  :  Basic  and  Basic  algorithm. 

Prmry  Prmry  :  Primary  Copy  and  Primary  Copy  algorithm. 

Cntrl  :  Primary  Site  and  Primary  Site  algorithm. 

Basic  Prmry  :  Basic  and  Primary  Copy  algorithm. 

Basic  Cntrl  :  Basic  and  Primary  Site  algorithm. 

Basic  Tstmp  :  Basic  Timestamp  algorithm. 

Mltpl  Versn  :  DDM  Multiple  Version  and  Optimistic  algorithm. 

Basic  Optms  :  Basic  and  Optimistic  algorithm. 
* Multiple programming levels at the three sites are 10/11/11.

Multiple programming levels at the three sites are 16/8/8.

Multiple programming levels at the three sites are 24/4/4.

TZ : Average no. of requests per transaction (transaction size).

DZ : Total number of data items in the database (database size).

MP : Multiprogramming level.

R/(R+W) : Percentage of transactions that are read-only.

IO/Comm : Ratio of local delay to communication delay
(excluding queueing delay).

No. of Copy : Fraction of the database residing at sites S1, S2, & S3

Figure A.3 Average Response Per Read Request:
Short Transactions & Communication Bound


R/(R+W)  IO/Comm  no. of copy
                  S1    S2    S3

75%      .4/1/2   1     1/2   1/2
25%      .4/1/2   1     1/2   1/2
75%      2/1/2    1     1/2   1/2
25%      2/1/2    1     1/2   1/2
75%      .4/1/8   1     1/2   1/2
25%      .4/1/8   1     1/2   1/2
75%      2/1/8    1     1/2   1/2
25%      2/1/8    1     1/2   1/2
75%      .4/1/2   2/3   2/3   2/3
25%      .4/1/2   2/3   2/3   2/3
75%      2/1/2    2/3   2/3   2/3
25%      2/1/2    2/3   2/3   2/3
75%      .4/1/8   2/3   2/3   2/3
25%      .4/1/8   2/3   2/3   2/3
75%      2/1/8    2/3   2/3   2/3
25%      2/1/8    2/3   2/3   2/3
75%      .4/1/2   1     1/2   1/2
25%      .4/1/2   1     1/2   1/2
75%      2/1/2    1     1/2   1/2
25%      2/1/2    1     1/2   1/2
75%      .4/1/8   1     1/2   1/2
25%      .4/1/8   1     1/2   1/2
75%      2/1/8    1     1/2   1/2
25%      2/1/8    1     1/2   1/2
75%      .4/1/2   2/3   2/3   2/3
25%      .4/1/2   2/3   2/3   2/3
75%      2/1/2    2/3   2/3   2/3
25%      2/1/2    2/3   2/3   2/3
75%      .4/1/8   2/3   2/3   2/3
25%      .4/1/8   2/3   2/3   2/3
75%      2/1/8    2/3   2/3   2/3
25%      2/1/8    2/3   2/3   2/3


Cntrl / Total
iTI/ST
.93/2.7  $.6/1.9  .91/2.7  3.3/1.1  .39/1.2  3.1/1.1  .40/1.2  2.6/.87  .66/1.9  2.6/.85  .75/1.9

2.9/3.1  6.7/6.3  4.0/4.1  6.9/7.0  5.9/5.3  16/14  6.8/6.7  14/1  10/9  11/1  10/9  11/1

Basic / Prmry
8.9/2.9  .95/2.8  6.5/2.1  .98/2.9  3.8/1.2  .41/1.2  3.6/1.2

1.9/3.5  1.1/7.3  3.5/3.6  2.4/7.6  4.3/16  5.5/6.9


TZ = Average number of requests per transaction (transaction size).

DZ = Total number of data items in the database (database size).

MP = Multiprogramming level.

R/(R+W) = Percentage of transactions that are read-only.

IO/Comm = local delay/message communication delay/data communication delay
no. of copy = Fraction of the database residing at each site.

Figure A.5 Through-Put (Read/Write): Short Transactions
& Communication Bound
processing delay are simulated:
message processing delay and data processing delay.
Assumptions:

Queueing for local processing is simulated.

Two kinds of local processing delay are simulated:
message processing delay and data processing delay.

The average round trip communication is fixed at 1
The message processing delay is fixed at 5% of the
round trip communication delay
Ratio of data processing & message processing delay is 10
The ratio of data processing delay to round trip
communication delay is shown in column 'IO/Com'

Notation:

TZ = Average number of requests per transaction (transaction size).
DZ = Total number of data items in the database (database size).
IO/Com = Ratio of local data processing delay to communication
delay (excluding queueing).

Database Copies = Fraction of the database residing at each site.

Figure A.7 Average Response Time (Read/Write):
* Multiple programming levels at the three sites are 10/11/11
Ratio of local data processing & message processing delay is

Assumption:

Queueing for local processing is simulated.

Two kinds of local processing are simulated:
(message and data processing).

The round trip communication is fixed at 1
The local message processing delay is fixed at
5% of the round trip communication delay
The ratio of local data processing delay to round trip
communication delay is shown in column 'IO/Comm'
Notation:

TZ = Average number of requests per transaction.

DZ = Total number of data items in the database.

MP = Multiple programming level.

R/W = Ratio of read-only to write transactions.

IO/Com = Ratio of local data processing delay to
communication delay (excluding queueing).
Database Copies = Fraction of the database at each site.


Figure A.6 Through-Put (Read/Write): Long
Transaction Loaded & IO Bound
* Multiple programming levels at the three sites are 10
Ratio of local data processing & message processing de

Assumption:

Two kinds of local processing are simulated:
(message and data processing).

The round trip communication is fixed at 1
The local message processing delay is fixed at
5% of the round trip communication delay
The ratio of local data processing delay to round trip
communication delay is shown in column "IO/Comm"
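
The delay parameters above can be turned into concrete values with a small calculation. This is a sketch only: the constant and function names are hypothetical, all delays are expressed in units of one round trip, and the two ratios in the loop are example settings rather than the rows of this particular figure.

```python
ROUND_TRIP = 1.0         # round trip communication delay, fixed at 1
MESSAGE_FRACTION = 0.05  # local message processing delay: 5% of the round trip


def local_delays(io_comm_ratio):
    """Return (message_delay, data_delay) for one IO/Comm setting."""
    message_delay = MESSAGE_FRACTION * ROUND_TRIP
    data_delay = io_comm_ratio * ROUND_TRIP
    return message_delay, data_delay


for ratio in (0.2, 2.0):  # example IO/Comm settings
    print(ratio, local_delays(ratio))
```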

Notation:

TZ = Average number of requests per transaction.

DZ = Total number of data items in the database.

MP = Multiple programming level.

R/W = Ratio of read-only to write transactions.

IO/Com = Ratio of local data processing delay to
communication delay (excluding queueing).
Database Copies = Fraction of the database at each site.


Figure A.9 Average Response Time: Long
Transaction Loaded & IO Bound
Queueing for communication channel is simulated.

Only one kind of local processing is simulated.

The average round trip communication is fixed at 1
The ratio of local data processing delay to round trip
communication delay is shown in column "IO/Comm"

Notation:

TZ = Average number of requests per transaction.

DZ = Total number of data items in the database.

MP = Multiple programming level.

Database Copies = Fraction of the database at each site.

Figure A.10 Through-Put (Read/Write): Long Transaction
Loaded & Communication Bound


MP  R/W  IO/Com  Database Copies

*   .25  .2      1    1    1
*   .75  .2      1    1    1
*   .25  2       1    1    1
*   HI   2.2     lh   2)3  2)3
*   .75  .2      2/3  2/3  2/3
*   .25  1       2/3  2/3  2/3
*   .75  1       2/3  2/3  2/3
* Multiple programming levels at the three sites are 10/11/11.
Assumption:

Queueing for communication channel is simulated.

Only one kind of local processing is simulated.

The average round trip communication is fixed at 1
The ratio of local data processing delay to round trip
communication delay is shown in column IO/Comm

Notation:

TZ = Average number of requests per transaction.

DZ = Total number of data items in the database.

MP = Multiple programming level.

R/W = Ratio of read-only to write transactions.

IO/Com = Ratio of local processing delay to communication
delay (excluding queueing delay).

Database Copies = Fraction of the database at each site.


Figure A.11 Average Response Time (Read/Write):
Long Transaction & Communication Bound


